Abstract



It is common to view programs as a combination of logic and control: the logic part
defines what the program must do, the control part how to do it. The Logic Programming
paradigm was developed with the intention of separating the logic from the control.
Recently, extensive research has been conducted on automatic generation of control for
logic programs. Only a few of these works considered the issue of automatic generation of
control for improving the efficiency of logic programs. In this paper we present a novel
algorithm for automatically finding lowest-cost subgoal orderings. The algorithm uses
a divide-and-conquer strategy. The given set of subgoals is partitioned into smaller sets,
based on co-occurrence of free variables. The subsets are ordered recursively and merged,
yielding a provably optimal order. We experimentally demonstrate the utility of the algorithm
by testing it in several domains, and discuss the possibilities of its cooperation with
other existing methods.

1. Introduction 

It is common to view programs as a combination of logic and control (Kowalski, 1979). The 
logic part defines what the program must do, the control part - how to do it. Traditional 
programming languages require that the programmers supply both components. The Logic 
Programming paradigm was developed with the intention of separating the logic from the 
control (Lloyd, 1987). The goal of the paradigm is that the programmer specifies the logic 
without bothering about the control, which should be supplied by the interpreter. 

Initially, most practical logic programming languages, such as Prolog (Clocksin & Mellish,
1987; Sterling & Shapiro, 1994), did not include the means for automatic generation of
control. As a result, a Prolog programmer had to implicitly define the control by the order of 
clauses and of subgoals within the clauses. Recently, extensive research has been conducted 
on automatic generation of control for logic programs. A major part of this research is con- 
cerned with control that affects correctness and termination of logic programs (De Schreye 
& Decorte, 1994; Somogyi, Henderson, & Conway, 1996b; Cortesi, Le Charlier, & Rossi, 
1997). Only a few of these works consider the issue of automatic generation of control for 
improving the efficiency of logic programs. Finding a good ordering that leads to efficient 
execution requires a deep understanding of the logic inference mechanism. Hence, in many 
cases, only expert programmers are able to generate efficient programs. The problem intensifies
with the recent development of the field of inductive logic programming (Muggleton
& De Raedt, 1994). There, logic programs are automatically induced by learning. Such
learning algorithms are commonly built with the aim of speeding up the induction process 
without considering the efficiency of resulting programs. 

The goal of the research described in this paper is to design algorithms that automat- 
ically find efficient orderings of subgoal sequences. Several researchers have explored the 
problem of automatic reordering of subgoals in logic programs (Warren, 1981; Naish, 1985b; 
Smith & Genesereth, 1985; Natarajan, 1987; Markovitch & Scott, 1989). The general sub- 
goal ordering problem is known to be NP-hard (Ullman, 1982; Ullman & Vardi, 1988). 
Smith and Genesereth (1985) and Markovitch and Scott (1989) present search algorithms 
for finding optimal orderings. These algorithms are general and carry exponential costs for 
non-trivial sets of subgoals. Natarajan (1987) describes an efficient algorithm for the special 
case where subgoals in the set do not share free variables. 

In this paper we present a novel algorithm for subgoal ordering. We call two subgoals 
that share a free variable dependent. Unlike Natarajan's approach, which can only handle 
subgoal sets that are completely independent, our algorithm can deal with any subgoal 
set, while making maximal use of the existing dependencies for acceleration of the ordering 
process. In the worst case the algorithm - like that of Smith and Genesereth - is exponential. 
Still, in most practical cases, our algorithm exploits subgoal dependencies and finds optimal 
orderings in polynomial time. 

We start with an analysis of the ordering problem and demonstrate its importance 
through examples. We then show how to compute the cost of a given ordering based on 
the cost and the number of solutions of the individual subgoals. We describe the algorithm 
of Natarajan and the algorithm of Smith and Genesereth and show how the two can be 
combined into an algorithm that is more efficient and general than each of the two. We 
show drawbacks of the combined algorithm and introduce the new algorithm, which avoids 
these drawbacks. We call it the Divide-and-Conquer algorithm (DAC algorithm). We prove
the correctness of the algorithm, discuss its complexity and compare it to the combined 
algorithm. The DAC algorithm assumes knowledge of the cost and the number of solutions 
of the subgoals. This knowledge can be obtained by machine learning techniques such as 
those employed by Markovitch and Scott (1989). Finally, we test the utility of our algorithm 
by running a set of experiments on artificial and real domains. 

The DAC algorithm for subgoal ordering can be combined with many existing methods 
in logic programming, such as program transformation, compilation, termination control, 
correctness verification, and others. We discuss the possibilities of such combinations in the 
concluding section. 

Section 2 states the ordering problem. Section 3 describes existing ordering algorithms 
and their combination. Section 4 presents the new algorithm. Section 5 discusses the 
acquisition of the control knowledge. Section 6 contains experimental results. Section 7 
contains a discussion of practical issues, comparison with other works and conclusions. 

2. Background: Automatic Ordering of Subgoals 

We start by describing the conventions and assumptions accepted in this paper. Then we 
demonstrate the importance of subgoal ordering and discuss its validity. Finally, we present 
a classification of ordering methods and discuss related work. 
2.1 Conventions and Assumptions

All constant, function and predicate symbols in programs begin with lower case letters,
while capital letters are reserved for variables. Braces are used to denote unordered sets
(e.g., $\{a,b,c\}$), and angle brackets are used for ordered sequences (e.g., $\langle a,b,c \rangle$). Parallel
lines ($\|$) denote concatenation of ordered sequences of subgoals. When speaking about
abstract subgoals (and not named predicates of concrete programs), we denote separate
subgoals by capital letters ($A, B, \ldots$), ordered sequences of subgoals by capitalized vectors
($\vec{B}, \vec{O}_S, \ldots$), and sets of subgoals by calligraphic capitals ($\mathcal{B}, \mathcal{S}, \ldots$). $\pi(\mathcal{S})$ denotes the set of
all permutations of $\mathcal{S}$.

We assume that the programs we work with are written in pure Prolog, i.e., without cut 
operators, meta-logical or extra-logical predicates. Alternatively, we can assume that only 
pure Prolog sub-sequences of subgoals are subject to ordering. For example, given a rule of 
the form 

$A \leftarrow B_1, B_2, B_3, !, B_4, B_5, B_6.$

only its final part $\{B_4, B_5, B_6\}$ can be ordered (without affecting the solution set).

In this work we focus upon the task of finding all the solutions to a set of subgoals. 

2.2 Ordering of Subgoals in Logic Programs 

A logic program is a set of clauses: 

$A \leftarrow B_1, B_2, \ldots, B_n. \quad (n > 0)$

where $A, B_1, \ldots, B_n$ are literals (predicates with arguments). To use such a clause for
proving a goal that matches $A$, we must prove that all the $B_i$ hold simultaneously, under
consistent bindings of the free variables. A solution is such a set of variable bindings. The 
solution set of a goal is the bag of all its solutions created by its program. 

A computation rule defines which subgoal will be proved next. In Prolog, the compu- 
tation rule always selects the leftmost subgoal in a goal. If a subgoal fails, backtracking is 
performed - the proof of the previous subgoal is re-entered to generate another solution. 
For a detailed definition of the logic inference process, see Lloyd (1987). 

Theorem 1 The solution set of a set of subgoals does not depend on the order of their 
execution. 

Proof: When we are looking for all solutions, the solution set does not depend on the 
computation rule chosen (Theorems 9.2 and 10.3 in Lloyd, 1987). Since a transposition of 
subgoals in an ordered sequence can be regarded as a change of the computation rule (the 
subgoals are selected in different order), such transposition does not change the solution 
set. □ 

This theorem implies that we may reorder subgoals during the proof derivation. Yet the 
efficiency of the derivation strongly depends on the chosen order of subgoals. The following 
example illustrates how two different orders can lead to a large difference in execution 
efficiency. 
    parent(abraham, isaac).        male(abraham).
    parent(sarah, isaac).          male(isaac).
    parent(abraham, ishmael).      male(ishmael).
    parent(isaac, esav).           male(jakov).
    parent(isaac, jakov).          male(esav).
    ... More parent clauses ...    ... More male clauses ...

    brother(X,Y) ← male(X), parent(W,X), parent(W,Y), X =/= Y.
    father(X,Y) ← male(X), parent(X,Y).
    uncle(X,Y) ← parent(Z,Y), brother(X,Z).
    ... More rules of relations ...

Figure 1: A small fragment of a Biblical database describing family relationships.



Example 1 

Consider a Biblical family database such as the one listed in Figure 1 (a similar database 
appears in the book by Sterling & Shapiro, 1994). The body of the rule defining the 
uncle-nephew (or uncle-niece) relation can be ordered in two ways: 

1. uncle(X,Y) ← brother(X,Z), parent(Z,Y).

2. uncle(X,Y) ← parent(Z,Y), brother(X,Z).

To prove the goal uncle(ishmael,Y) using the first version of the rule, the interpreter will
first look for Ishmael's siblings (and find Isaac) and then for the siblings' children (Esav
and Jacov). The left part of Figure 2 shows the associated proof tree with a total of 10
nodes. If we use the second version of the rule, the interpreter will create all the parent-child
pairs available in the database, and will test for each parent whether he (or she) is
Ishmael's sibling. The right part of Figure 2 shows the associated proof tree with a total
of $4(N - 2) + 6 \cdot 2 + 2 = 4N + 6$ nodes, where $N$ is the number of parent-child pairs in the
database. The tree contains two success branches and $N - 2$ failure branches; in the figure
we show one example of each. While the two versions of the rule yield identical solution
sets, the first version leads to a much smaller tree and to a faster execution.

Note that this result is true only for the given mode (bound, free) of the head literal;
for the mode (free, bound), as in uncle(X,jacov), the outcome is the contrary: the second
version of the rule yields a smaller tree.
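
The effect of subgoal order on the amount of work can be reproduced on a toy scale. The following is only a sketch under stated assumptions, not the interpreter discussed in this paper: it evaluates a conjunctive goal left to right over ground facts, and it charges one unit of cost per attempted match against a fact. The goal used (grandchildren of Abraham) and the names `FACTS`, `unify` and `solve` are hypothetical and chosen for illustration.

```python
# Toy left-to-right evaluator for conjunctive goals over ground facts.
# Assumption: cost = number of attempted fact unifications.

FACTS = [
    ("parent", ("abraham", "isaac")),
    ("parent", ("sarah", "isaac")),
    ("parent", ("abraham", "ishmael")),
    ("parent", ("isaac", "esav")),
    ("parent", ("isaac", "jacov")),
]


def is_var(term):
    """Variables start with a capital letter, following the paper's convention."""
    return term[0].isupper()


def unify(args, fact_args, binding):
    """Match subgoal arguments against a ground fact; return the extended binding or None."""
    new = dict(binding)
    for a, f in zip(args, fact_args):
        if is_var(a):
            if new.setdefault(a, f) != f:   # already bound to something else
                return None
        elif a != f:                        # constant mismatch
            return None
    return new


def solve(goal, binding=None, counter=None):
    """Return (all solutions of the ordered goal, number of unification attempts)."""
    binding = {} if binding is None else binding
    counter = [0] if counter is None else counter
    if not goal:
        return [binding], counter[0]
    (pred, args), rest = goal[0], goal[1:]
    solutions = []
    for fact_pred, fact_args in FACTS:
        if fact_pred != pred or len(fact_args) != len(args):
            continue
        counter[0] += 1
        new = unify(args, fact_args, binding)
        if new is not None:
            solutions.extend(solve(rest, new, counter)[0])
    return solutions, counter[0]


# Two orderings of the same (hypothetical) goal: grandchildren of abraham.
order1 = [("parent", ("abraham", "Z")), ("parent", ("Z", "Y"))]
order2 = [("parent", ("Z", "Y")), ("parent", ("abraham", "Z"))]
sols1, cost1 = solve(order1)
sols2, cost2 = solve(order2)
print(cost1, cost2)
```

Both orderings return the same two solutions, but the ordering that starts with the more constrained subgoal performs fewer match attempts on this database.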

2.3 Categories of Subgoal Ordering Methods 

Assume that the current conjunctive goal (the current resolvent) is $\{A_1, A_2\}$. Assume that
we use the rule "$A_1 \leftarrow A_{11}, A_{12}$" to reduce $A_1$. According to Theorem 1, the produced
resolvent, $\{A_{11}, A_{12}, A_2\}$, can be executed in any order. We call ordering methods that
allow any permutation of the resolvent interleaving ordering methods, since they permit 

[Figure 2 (proof trees not reproduced). Left panel: uncle(X,Y) ← brother(X,Z), parent(Z,Y). Right panel: uncle(X,Y) ← parent(Z,Y), brother(X,Z).]

Figure 2: Two proof trees obtained with different orderings of a single rule in Example 1.



interleaving of subgoals from different rule bodies. When ordering is performed only on
rule bodies before using them for reduction, the method is non-interleaving. In the above
example, interleaving methods will consider all 6 permutations of the resolvent, while non-interleaving
methods will consider only two orderings: $\langle A_{11}, A_{12}, A_2 \rangle$ and $\langle A_{12}, A_{11}, A_2 \rangle$.
Interleaving ordering methods deal with significantly more possible orderings than non-interleaving
methods, so they can find more efficient orderings. On the
other hand, the space of possible orderings may become prohibitively large, requiring too
many computational resources.
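
The two search spaces in this example can be enumerated directly. The snippet below is a small illustration only; the variable names are hypothetical and the subgoals are represented by their labels.

```python
from itertools import permutations

resolvent_rest = ["A2"]        # subgoals already present in the resolvent
rule_body = ["A11", "A12"]     # body of the rule used to reduce A1

# Interleaving: any permutation of the whole new resolvent is allowed.
interleaving = [list(p) for p in permutations(rule_body + resolvent_rest)]

# Non-interleaving: only the rule body is permuted, and it is then
# placed in front of the remaining resolvent.
non_interleaving = [list(p) + resolvent_rest for p in permutations(rule_body)]

print(len(interleaving), len(non_interleaving))
```

For a resolvent of $n$ subgoals and a rule body of $k$ subgoals, the interleaving space has $n!$ orderings while the non-interleaving space has only $k!$.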

Subgoal ordering can take place at various stages of the proof process. We divide all 
subgoal ordering methods into static, semi-dynamic and dynamic. 

• Static ordering: The rule bodies are ordered before the execution starts. No order- 
ing takes place during the execution. 

• Semi-dynamic ordering: Whenever a rule is selected for reduction, its body is 
ordered. The order of its subgoals does not change after the reduction takes place. 

• Dynamic ordering: The ordering decision is made at each inference step. 

Static methods add no overhead to the execution time. However, the optimal ordering 
of a rule often depends on a particular binding of a variable, which can be known only at 
run-time. For instance, in Example 1 we saw that the first ordering of the rule is better 
for proving the goal uncle(ishmael,Y). And yet, for the goal uncle(X,jacov), it is the
second ordering that yields more efficient execution. To handle such cases statically, we 
must compute the optimal ordering for each possible binding. 
Obviously, static ordering can only be non-interleaving. The dynamic method is more
flexible, since it can use more updated knowledge about variable bindings, but it also carries 
the largest runtime overhead, since it is invoked several times for each use of a rule body. 
The semi-dynamic method is a compromise between the two: it is more powerful than the 
static method, because it can dynamically propose different orderings for different instances 
of the same rule; it also carries less overhead than the dynamic method, because it is invoked 
only once for each use of a rule body. 

The total time of proving a goal is the sum of the ordering time and the inference time. 
Interleaving and dynamic methods have the best potential for reducing the inference time, 
but may significantly augment the ordering time. Static methods do not devote time to
ordering (it is done off-line), but have a limited potential for reducing the inference time. 

The algorithms described in this paper can be used for all categories of ordering methods, 
although in the experiments described in Section 6 we have only implemented semi-dynamic, 
non-interleaving ordering methods: on each reduction, the rule body is ordered and added 
to the left end of the resolvent, and then the leftmost literal of the resolvent is selected for 
the next reduction step. 

2.4 Related Work 

The computational inefficiency of logic inference has been the subject of extensive
research. The most obvious aspect of this inefficiency is the possible non-termination of
a proof. Several researchers developed compile-time and run-time techniques to detect
and avoid infinite computations (De Schreye & Decorte, 1994). Some success was
achieved in providing more advanced control through the use of co-routining for inter-predicate
synchronization (Clark & McCabe, 1979; Porto, 1984; Naish, 1984).
Also, infinite computations can be avoided by pruning infinite branches that do not contain
solutions (Vasak & Potter, 1985; Smith, Genesereth, & Ginsberg, 1986; Bol, Apt, & Klop,
1991). In the NAIL! system (Morris, 1988) subgoals are automatically reordered to avoid
nontermination.

Still, even when the proof is finite, it is desirable to make it more efficient. Several
researchers studied the problem of clause ordering (Smith, 1989; Cohen, 1990; Etzioni,
1991; Laird, 1992; Mooney & Zelle, 1993; Greiner & Orponen, 1996). If we are looking for
all the solutions of a goal, then the efficiency does not depend on the clause order (assuming
no cuts). Indeed, if some predicate has $m$ clauses, and for some argument bindings these
clauses produce all their solutions in times $t_1, t_2, \ldots, t_m$, then all solutions of the predicate
under these bindings are obtained in time $t_1 + t_2 + \cdots + t_m$, regardless of the order in
which the clauses are applied. Different clause orderings correspond to different orders in
which branches are selected in a proof tree; if we traverse the entire tree, then the number
of traversal steps does not depend on the order of branch selection, though the order of
solutions found does depend on it.

Subgoal ordering, as was demonstrated in Example 1, can significantly affect the efficiency
of proving a goal. There are two major approaches to subgoal ordering. The first
approach uses various heuristics to order subgoals, for example: 

• Choose a subgoal whose predicate has the smallest number of matching clauses (Minker, 
1978). 
• Prefer a subgoal with more constants (Minker, 1978).

• Choose a subgoal with the largest size, where the size is defined as the number of 
occurrences of predicate symbols, function symbols, and variables (Nie & Plaisted, 
1990). 

• Choose a subgoal with the largest mass, where the mass of a subgoal depends on 
the frequency of its arguments and sub-arguments in the entire goal (Nie & Plaisted, 
1990). 

• Choose a subgoal with the least number of solutions (Warren, 1981; Nie & Plaisted, 
1990). 

• Apply "tests" before "generators" (Naish, 1985a). 

• Prefer calls that fail quickly (Naish, 1985b). 

The heuristic methods usually execute quickly, but may yield suboptimal orderings. 

The second approach, which is adopted in this paper, aims at finding optimal order- 
ings (Smith & Genesereth, 1985; Natarajan, 1987; Markovitch & Scott, 1989). Natarajan 
proposed an efficient way to order a special sort of subgoal set (where all subgoals are in- 
dependent), while Smith and Genesereth proposed a general, but inefficient algorithm. In 
the following section we build a unifying framework for dealing with subgoal ordering and 
describe variations on Natarajan's and Smith and Genesereth's algorithms. We also show 
how the two can be combined for increased efficiency. 

3. Algorithms for Subgoal Ordering in Logic Programs 

The goal of the work presented here is to order subgoals for speeding up logic programs. This 
section starts with an analysis of the cost of executing a sequence of subgoals. The resulting 
formula is the basis for the subsequent ordering algorithms. Then we discuss dependence 
of subgoals and present existing ordering algorithms for independent and dependent sets of 
subgoals. Finally, we combine these algorithms into a more general and efficient one. 

3.1 The Cost of Executing a Sequence of Subgoals 

In this subsection we analyze the cost of executing a sequence of subgoals. The analysis 
builds mainly on the work of Smith and Genesereth (1985). 

Let $\mathcal{S} = \{A_1, A_2, \ldots, A_k\}$ be a set of subgoals and $b$ be a binding. We denote by $Sols(\mathcal{S})$
the solution set of $\mathcal{S}$, and define $Sols(\emptyset) = \{\emptyset\}$. We denote by $A_i|_b$ the subgoal $A_i$ whose variables
are bound according to $b$ ($A_i|_{\emptyset} = A_i$). Finally, we denote by $Cost(A_i|_b)$ the amount of
resources needed for proving $A_i|_b$. $Cost(A_i|_b)$ should reflect the time complexity of proving
$A_i$ under binding $b$. For example, the number of unification steps is a natural measure of
complexity for logic programs (Itai & Makowsky, 1987).

To obtain the cost of finding all the solutions of an ordered sequence of subgoals

$$\vec{S} = \langle A_1, A_2, A_3, \ldots, A_n \rangle, \qquad (1)$$

we note that the proof-tree of $A_1$ is traversed only once, the tree of $A_2$ is traversed once
for each solution generated by $A_1$, the tree of $A_3$ once for each solution of $\{A_1, A_2\}$, etc.
Consequently, the total cost of proving Equation 1 is

$$Cost(\langle A_1, \ldots, A_n \rangle) = Cost(A_1) + \sum_{b \in Sols(\{A_1\})} Cost(A_2|_b) + \cdots + \sum_{b \in Sols(\{A_1, \ldots, A_{n-1}\})} Cost(A_n|_b) = \sum_{i=1}^{n} \; \sum_{b \in Sols(\{A_1, \ldots, A_{i-1}\})} Cost(A_i|_b). \qquad (2)$$

To compute Equation 2 one must know the cost and the solution set for each subgoal 
under each binding. To reduce the amount of information needed, we derive an equivalent 
formula, which uses average cost and average number of solutions. 

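Definition: Let $\mathcal{B}$ be a set of subgoals, $A$ a subgoal. Define $\overline{cost}(A)|_{\mathcal{B}}$ to be the average
cost of $A$ over all solutions of $\mathcal{B}$ and $\overline{nsols}(A)|_{\mathcal{B}}$ to be its average number of solutions over
all solutions of $\mathcal{B}$:

$$\overline{cost}(A)|_{\mathcal{B}} = \begin{cases} Cost(A), & \mathcal{B} = \emptyset \\ \dfrac{\sum_{b \in Sols(\mathcal{B})} Cost(A|_b)}{|Sols(\mathcal{B})|}, & \mathcal{B} \neq \emptyset,\ Sols(\mathcal{B}) \neq \emptyset \\ \text{undefined}, & \mathcal{B} \neq \emptyset,\ Sols(\mathcal{B}) = \emptyset \end{cases}$$

$$\overline{nsols}(A)|_{\mathcal{B}} = \begin{cases} |Sols(\{A\})|, & \mathcal{B} = \emptyset \\ \dfrac{\sum_{b \in Sols(\mathcal{B})} |Sols(\{A|_b\})|}{|Sols(\mathcal{B})|}, & \mathcal{B} \neq \emptyset,\ Sols(\mathcal{B}) \neq \emptyset \\ \text{undefined}, & \mathcal{B} \neq \emptyset,\ Sols(\mathcal{B}) = \emptyset \end{cases}$$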



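From the first definition, it follows that:

$$\sum_{b \in Sols(\{A_1, \ldots, A_{i-1}\})} Cost(A_i|_b) = |Sols(\{A_1, \ldots, A_{i-1}\})| \times \overline{cost}(A_i)|_{\{A_1, \ldots, A_{i-1}\}}. \qquad (3)$$

If we apply the second definition recursively, we obtain

$$|Sols(\{A_1, \ldots, A_i\})| = \sum_{b \in Sols(\{A_1, \ldots, A_{i-1}\})} |Sols(\{A_i|_b\})| = |Sols(\{A_1, \ldots, A_{i-1}\})| \times \overline{nsols}(A_i)|_{\{A_1, \ldots, A_{i-1}\}} = \cdots = \prod_{j=1}^{i} \overline{nsols}(A_j)|_{\{A_1, \ldots, A_{j-1}\}}. \qquad (4)$$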

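Note that we defined $Sols(\emptyset) = \{\emptyset\}$; thus, these equations hold also for $i = 1$. Incorporation
of Equations 3 and 4 into Equation 2 yields

$$Cost(\langle A_1, A_2, \ldots, A_n \rangle) = \sum_{i=1}^{n} \left[ \left( \prod_{j=1}^{i-1} \overline{nsols}(A_j)|_{\{A_1, \ldots, A_{j-1}\}} \right) \times \overline{cost}(A_i)|_{\{A_1, \ldots, A_{i-1}\}} \right]. \qquad (5)$$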
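
Equation 5 translates into a single pass over the sequence. The function below is a minimal sketch; its name and argument layout are assumptions made here, and the numbers in the usage line are hypothetical. The caller is assumed to supply, for each position, the average cost and average number of solutions of the subgoal given the subgoals that precede it.

```python
def sequence_cost(costs, nsols):
    """Cost of an ordered sequence according to Equation 5.

    costs[i] and nsols[i] are the average cost and the average number of
    solutions of the i-th subgoal, conditioned on the preceding subgoals.
    """
    total = 0.0
    solutions_so_far = 1.0          # |Sols(empty set)| = 1
    for c, n in zip(costs, nsols):
        total += solutions_so_far * c   # this subgoal runs once per earlier solution
        solutions_so_far *= n
    return total


# Hypothetical control values for a three-subgoal sequence:
example = sequence_cost([3, 2, 4], [2, 0.5, 3])   # 3 + 2*2 + (2*0.5)*4
```

Keeping a running product of the solution counts avoids recomputing the inner product of Equation 5 for every term.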
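For each subgoal $A_i$, its average cost is multiplied by the total number of solutions of
all the preceding subgoals. We can define average cost and number of solutions for every
continuous sub-sequence of Equation 1: $\forall k_1, k_2$, $1 \le k_1 \le k_2 \le n$,

$$\overline{cost}(\langle A_{k_1}, \ldots, A_{k_2} \rangle)|_{\{A_1, \ldots, A_{k_1-1}\}} = \frac{\overline{cost}(\langle A_1, \ldots, A_{k_2} \rangle)|_{\emptyset} - \overline{cost}(\langle A_1, \ldots, A_{k_1-1} \rangle)|_{\emptyset}}{\overline{nsols}(\langle A_1, \ldots, A_{k_1-1} \rangle)|_{\emptyset}} = \sum_{i=k_1}^{k_2} \left[ \left( \prod_{j=k_1}^{i-1} \overline{nsols}(A_j)|_{\{A_1, \ldots, A_{j-1}\}} \right) \times \overline{cost}(A_i)|_{\{A_1, \ldots, A_{i-1}\}} \right],$$

$$\overline{nsols}(\langle A_{k_1}, \ldots, A_{k_2} \rangle)|_{\{A_1, \ldots, A_{k_1-1}\}} = \frac{\overline{nsols}(\langle A_1, \ldots, A_{k_2} \rangle)|_{\emptyset}}{\overline{nsols}(\langle A_1, \ldots, A_{k_1-1} \rangle)|_{\emptyset}} = \prod_{i=k_1}^{k_2} \overline{nsols}(A_i)|_{\{A_1, \ldots, A_{i-1}\}}. \qquad (6)$$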

The values of $\overline{cost}(A_i)$ and $\overline{nsols}(A_i)$ depend on the position of $A_i$ in the ordered sequence.
For example, assume that we want to find Abraham's sons, using the domain of
Example 1. The unordered conjunctive goal is {male(Y), parent(abraham,Y)}. Let there
be $N$ males in the database (two of them, Isaac and Ishmael, are Abraham's sons):

$$\overline{nsols}(\text{male(Y)})|_{\emptyset} = N \qquad \overline{nsols}(\text{parent(abraham,Y)})|_{\emptyset} = 2$$

$$\overline{nsols}(\text{male(Y)})|_{\{\text{parent(abraham,Y)}\}} = 1 \qquad \overline{nsols}(\text{parent(abraham,Y)})|_{\{\text{male(Y)}\}} = 2/N$$

Note that $\overline{nsols}(\langle \text{male(Y)}, \text{parent(abraham,Y)} \rangle) = 2 = \overline{nsols}(\langle \text{parent(abraham,Y)}, \text{male(Y)} \rangle)$,
exactly as Theorem 1 predicts.
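
The equality above is an instance of the product form of Equation 4 and can be checked numerically. In the sketch below the value of `N` is a hypothetical choice; any positive number of males gives the same product.

```python
from math import isclose, prod

N = 10  # hypothetical number of males in the database

# Conditional average numbers of solutions, listed in sequence order.
male_first = [N, 2 / N]     # male(Y), then parent(abraham,Y) given male(Y)
parent_first = [2, 1]       # parent(abraham,Y), then male(Y) given parent(abraham,Y)

same_total = isclose(prod(male_first), prod(parent_first))
```

Only the total number of solutions is order-independent; the intermediate counts, and therefore the cost of Equation 5, differ between the two orderings.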

Having defined the cost of a sequence of subgoals, we can now define the objective of 
our ordering algorithms: 

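Definition: Let $\mathcal{S}$ be a set of subgoals. Define $\pi(\mathcal{S})$ to be the set of all permutations of
$\mathcal{S}$. $\vec{O}_S \in \pi(\mathcal{S})$ is a minimal ordering of $\mathcal{S}$ (denoted $Min(\vec{O}_S, \mathcal{S})$), if its cost according to
Equation 5 is minimal over all possible permutations of $\mathcal{S}$:

$$Min(\vec{O}_S, \mathcal{S}) \iff \forall \vec{O}'_S \in \pi(\mathcal{S}) : Cost(\vec{O}_S) \le Cost(\vec{O}'_S).$$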



The total execution time is the sum of the time which is spent on ordering, and the 
inference time spent by the interpreter on the ordered sequence. In this paper we focus 
upon developing algorithms for minimizing the inference time. Elsewhere (Ledeniov & 
Markovitch, 1998a, 1998b) we present algorithms that attempt to reduce the total execution 
time. 

The values of cost and number of solutions can be obtained in various ways: by exact 
computation, by estimation and bounds, and by learning. Let us assume at the moment 
that there exists a mechanism that returns the average cost and number of solutions of a 
subgoal in time $\tau$. In Section 5 we show how this control knowledge can be obtained by
inductive learning. 

3.2 Ordering of Independent Sets of Subgoals 

The general subgoal ordering problem is NP-hard (Ullman & Vardi, 1988). However, there 
is a special case where ordering can be performed efficiently: if all the subgoals in the 
given set are independent, i.e., do not share free variables. This section begins with the
definition of subgoal dependence and related concepts. We then show an ordering algorithm 
for independent sets and prove its correctness. 

3.2.1 Dependence of Subgoals 

Definition: Let $\mathcal{S}$ and $\mathcal{B}$ be sets of subgoals ($\mathcal{B}$ is called the binding set of $\mathcal{S}$). A pair of
subgoals in $\mathcal{S}$ is directly dependent under $\mathcal{B}$, if they share a free variable not bound by a
subgoal of $\mathcal{B}$.

A pair of subgoals is indirectly dependent with respect to $\mathcal{S}$ and $\mathcal{B}$ if there exists a third
subgoal in $\mathcal{S}$ which is directly dependent on one of them under $\mathcal{B}$, and dependent (directly
or indirectly) on the other one under $\mathcal{B}$. A pair of subgoals of $\mathcal{S}$ is independent under $\mathcal{B}$ if
it is not dependent under $\mathcal{B}$ (either directly or indirectly). A subgoal is independent of $\mathcal{S}$
under $\mathcal{B}$ if it is independent of all members of $\mathcal{S}$ under $\mathcal{B}$.

Two subsets $\mathcal{S}_1 \subseteq \mathcal{S}$ and $\mathcal{S}_2 \subseteq \mathcal{S}$ are mutually independent under the binding set $\mathcal{B}$ if
every pair of subgoals $\langle A_1, A_2 \rangle$, such that $A_1 \in \mathcal{S}_1$ and $A_2 \in \mathcal{S}_2$, is independent under $\mathcal{B}$.

The entire set $\mathcal{S}$ is called independent under the binding set $\mathcal{B}$ if all its subgoal pairs
are independent under $\mathcal{B}$, and is called dependent otherwise. A dependent set of subgoals
is called indivisible if all its subgoal pairs are dependent under $\mathcal{B}$, and divisible otherwise.

A divisibility partition of $\mathcal{S}$ under $\mathcal{B}$, $DPart(\mathcal{S}, \mathcal{B})$, is a partition of $\mathcal{S}$ into subsets that
are mutually independent and indivisible under $\mathcal{B}$, except at most one subset which contains
all the subgoals independent of $\mathcal{S}$ under $\mathcal{B}$. It is easy to show that $DPart(\mathcal{S}, \mathcal{B})$ is unique.

For example, let $\mathcal{S}_0 = \{a, b(X), c(Y), d(X,Y), e(Z), f(Z,V), h(W)\}$. With respect to
$\mathcal{S}_0$ and an empty binding set, the pair $\{b(X), d(X,Y)\}$ is directly dependent, $\{b(X), c(Y)\}$
is indirectly dependent and $\{b(X), e(Z)\}$ is independent. If we represent a set of subgoals
as a graph, where subgoals are vertices and directly dependent subgoals are connected by
edges, then dependence is equivalent to connectivity and indivisible subsets are equivalent
to connected components of size greater than 1. The divisibility partition is the partition
of a graph into connected components, with all the "lonely" vertices collected together, in
a special component. Figure 3 shows an example of such a graph for the set $\mathcal{S}_0$ and for
an empty binding set. The whole set is divisible into four mutually independent subsets.
The subsets $\{e(Z), f(Z,V)\}$ and $\{b(X), c(Y), d(X,Y)\}$ are indivisible. Elements of the
divisibility partition $DPart(\mathcal{S}_0, \emptyset)$ are shown by dotted lines.
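
The graph view suggests computing the divisibility partition with a standard connected-components pass. The code below is a sketch under simplifying assumptions that are not part of the paper: subgoals are plain strings, variables are recognised as capitalised identifiers, and the binding set is represented directly by the set of variable names it binds. The names `free_vars` and `divisibility_partition` are hypothetical.

```python
import re


def free_vars(subgoal, bound=frozenset()):
    """Variables (capitalised identifiers) of a subgoal that are not bound."""
    return set(re.findall(r"\b[A-Z]\w*", subgoal)) - set(bound)


def divisibility_partition(subgoals, bound=frozenset()):
    """Return (indivisible_subsets, independent_subgoals) using union-find."""
    parent = {s: s for s in subgoals}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]   # path halving
            s = parent[s]
        return s

    owner = {}  # free variable -> first subgoal seen that contains it
    for s in subgoals:
        for v in free_vars(s, bound):
            if v in owner:
                parent[find(s)] = find(owner[v])   # shared free variable: merge
            else:
                owner[v] = s

    components = {}
    for s in subgoals:
        components.setdefault(find(s), []).append(s)

    indivisible = [c for c in components.values() if len(c) > 1]
    independent = [c[0] for c in components.values() if len(c) == 1]
    return indivisible, independent


S0 = ["a", "b(X)", "c(Y)", "d(X,Y)", "e(Z)", "f(Z,V)", "h(W)"]
indivisible, independent = divisibility_partition(S0)
```

With an empty binding set, `S0` yields two indivisible subsets and two independent subgoals. Passing `bound={"X"}` for a body such as `["male(X)", "parent(X,Y)"]` removes the only shared free variable, so both subgoals come out as independent.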

If a subgoal is independent of the set, then its average cost and number of solutions do
not depend on its position within the ordered sequence.
In this case we can omit the binding information and write $\overline{cost}(A_i)$ instead of $\overline{cost}(A_i)|_{\{A_1, \ldots, A_{i-1}\}}$,
and $\overline{nsols}(A_i)$ instead of $\overline{nsols}(A_i)|_{\{A_1, \ldots, A_{i-1}\}}$.

In practice, program rule bodies rarely feature independent sets of literals. An example 
is the following clause, which states that children like candy: 

likes(X,Y) ← child(X), candy(Y).

[Figure 3 (graph not reproduced): the vertices are the subgoals of {a, b(X), c(Y), d(X,Y), e(Z), f(Z,V), h(W)}.]

Figure 3: An example of a graph representing a set of subgoals. Directly dependent subgoals
are connected by edges. Independent subgoals and indivisible subsets are equivalent to
connected components (surrounded by dashed lines). The divisibility partition (under
the empty binding set) is shown by dotted lines.

More often, independent rule bodies appear not because they are written as such in the 
program text, but because some variables are bound in (initially dependent) rule bodies, as 
a result of clause head unification. For example, if the rule 

father(X,Y) ← male(X), parent(X,Y).

is used to reduce father(abraham,W), then X is bound to abraham, and the rule body
becomes independent. Rule bodies often become independent after substitutions are performed
in the course of the inference process.

3.2.2 Algorithm for Ordering Independent Sets by Sorting 

Let $\vec{S}$ be an ordered sub-sequence of subgoals, $\mathcal{B}$ a set of subgoals. We denote

$$cn(\vec{S})|_{\mathcal{B}} = \frac{\overline{nsols}(\vec{S})|_{\mathcal{B}} - 1}{\overline{cost}(\vec{S})|_{\mathcal{B}}}.$$

The name "cn" reflects the participation of $\overline{cost}$ and $\overline{nsols}$ in the definition. When the sub-sequence
$\vec{S}$ is independent of other subgoals, the binding information ($|_{\mathcal{B}}$) can be omitted.
Together, the average cost, average number of solutions, and cn value of a subgoal will be
called the control values of this subgoal.

For independent sets, there exists an efficient ordering algorithm, listed in Figure 4. The
complexity of this algorithm is $O(n(\tau + \log n))$: $O(n \cdot \tau)$ to obtain the control values of $n$
subgoals, and $O(n \log n)$ to perform the sorting (Knuth, 1973). To enable the division, we
must define the cost so that $\overline{cost}(A_i)$ is always positive. If we define the cost as the number
of unifications performed, then always $\overline{cost}(A_i) \ge 1$, under the reasonable assumption that
predicates of all rule body subgoals are defined in the program. (In this case, at least one
unification is performed for each subgoal.) Similar algorithms were proposed by Simon and
Kadane (1975) and Natarajan (1987).
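
The sorting algorithm amounts to one call to a library sort with the cn value as the key. The following sketch assumes that the control values are already available as a mapping from a subgoal name to a (cost, nsols) pair; the function names and the sample values are hypothetical.

```python
def cn(cost, nsols):
    """cn value of a subgoal: (nsols - 1) / cost. The cost must be positive."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (nsols - 1) / cost


def order_independent(control):
    """Order independent subgoals by ascending cn value.

    control maps a subgoal name to its (average cost, average nsols) pair.
    """
    return sorted(control, key=lambda name: cn(*control[name]))


# Hypothetical control values: a test, a generator, and a deterministic subgoal.
ordering = order_independent({"t": (2, 0.5), "g": (4, 3), "u": (1, 1)})
```

Here the test `t` (cn = -0.25) comes first, the deterministic subgoal `u` (cn = 0) second, and the generator `g` (cn = 0.5) last.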

Algorithm 1

Let $\mathcal{S} = \{A_1, A_2, \ldots, A_n\}$ be a set of subgoals.
Sort $\mathcal{S}$ using $cn(A_i) = \dfrac{\overline{nsols}(A_i) - 1}{\overline{cost}(A_i)}$ as the key for $A_i$, and return the result.

Figure 4: The algorithm for ordering subgoals by sorting.

Example 2 Let the set of independent subgoals be $\{p, q, r\}$, with the following control values:

             p       q       r
    cost     10      20      5
    nsols    1       5       0.1
    cn       0       0.2     -0.18



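We compute the costs of all possible orderings, using Equation 5:

$$\begin{aligned}
Cost(\langle p, q, r \rangle) &= 10 + 1 \cdot 20 + 1 \cdot 5 \cdot 5 = 55 \\
Cost(\langle p, r, q \rangle) &= 10 + 1 \cdot 5 + 1 \cdot 0.1 \cdot 20 = 17 \\
Cost(\langle q, p, r \rangle) &= 20 + 5 \cdot 10 + 5 \cdot 1 \cdot 5 = 95 \\
Cost(\langle q, r, p \rangle) &= 20 + 5 \cdot 5 + 5 \cdot 0.1 \cdot 10 = 50 \\
Cost(\langle r, p, q \rangle) &= 5 + 0.1 \cdot 10 + 0.1 \cdot 1 \cdot 20 = 8 \\
Cost(\langle r, q, p \rangle) &= 5 + 0.1 \cdot 20 + 0.1 \cdot 5 \cdot 10 = 12
\end{aligned}$$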



The minimal ordering is $\langle r, p, q \rangle$, and this is exactly the ordering which is found much
more quickly by Algorithm 1 for the set $\{p, q, r\}$: $r$ has the smallest cn value, $-0.18$, then
comes $p$ with $cn(p) = 0$, and finally $q$ with $cn(q) = 0.2$.
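
The numbers of Example 2 can be checked mechanically by enumerating all six orderings and comparing the cheapest one with the result of sorting by cn. This is an illustrative sketch only; `CONTROL` and `cost_of` are names introduced here.

```python
from itertools import permutations

# Control values of Example 2: name -> (cost, nsols).
CONTROL = {"p": (10, 1), "q": (20, 5), "r": (5, 0.1)}


def cost_of(order):
    """Equation 5 for independent subgoals (no binding information needed)."""
    total, sols = 0.0, 1.0
    for name in order:
        c, n = CONTROL[name]
        total += sols * c
        sols *= n
    return total


all_costs = {order: cost_of(order) for order in permutations(CONTROL)}
best = min(all_costs, key=all_costs.get)                 # exhaustive search
by_cn = tuple(sorted(CONTROL, key=lambda s: (CONTROL[s][1] - 1) / CONTROL[s][0]))
```

The exhaustive search inspects $n!$ orderings, whereas the sort needs only $O(n \log n)$ comparisons once the control values are known.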

Note that the sorting algorithm reflects a well-known principle: The best implementations
of generate-and-test programs are obtained with the tests placed as early as possible
in the rule body and the generators as late as possible (Naish, 1985a). Of course, the
cheap tests should come first, while the expensive ones should come last. If one looks at
the cn measure, one quickly realizes that tests should be put in front (because $\overline{nsols} < 1$,
so $cn < 0$), while generator subgoals should move towards the end ($\overline{nsols} > 1$, so $cn > 0$).
The weakness of the "test-first" principle is that not every subgoal can be easily
tagged as a test or a generator. If one subgoal has $\overline{nsols} < 1$ and another one has $\overline{nsols} > 1$,
then their order is obvious even without looking at the costs (because their cn values have
different signs). But if both subgoals have $\overline{nsols} < 1$, or both have $\overline{nsols} > 1$, then the
decision is not so simple. Sorting by cn can correctly handle all the possible cases.

3.2.3 Correctness Proof of the Sorting Algorithm for Independent Sets 

We saw that Algorithm 1 found a minimal ordering in Example 2. We are now going to
prove that Algorithm 1 always finds a minimal ordering for independent sets. First we
show an important lemma which will also be used in further discussion. This lemma states
that substitution of a sub-sequence by its cheaper permutation makes the entire sequence
cheaper.

Lemma 1

Let $S = A \| B \| C$, $S' = A \| B' \| C$, where $B$ and $B'$ are permutations of one another, and $A$
either is empty or has $nsols(A) > 0$. Then

$$
\begin{aligned}
Cost(S) < Cost(S') &\iff cost(B)|_A < cost(B')|_A, \\
Cost(S) = Cost(S') &\iff cost(B)|_A = cost(B')|_A.
\end{aligned}
$$

Proof: If $A$ and $C$ are not empty, then by Equation 5,

$$
\begin{aligned}
Cost(S) - Cost(S') &= Cost(A \| B \| C) - Cost(A \| B' \| C) \\
&= \left( cost(A)|_\emptyset + nsols(A)|_\emptyset \times cost(B)|_A + nsols(A \| B)|_\emptyset \times cost(C)|_{A \cup B} \right) \\
&\quad - \left( cost(A)|_\emptyset + nsols(A)|_\emptyset \times cost(B')|_A + nsols(A \| B')|_\emptyset \times cost(C)|_{A \cup B'} \right).
\end{aligned}
$$

By Theorem 1, $B$ and $B'$ produce the same solution sets. Hence, the third terms in the
parentheses above are equal, and

$$
Cost(S) - Cost(S') = nsols(A)|_\emptyset \times \left( cost(B)|_A - cost(B')|_A \right).
$$

Since $nsols(A) > 0$, the sign of $Cost(S) - Cost(S')$ coincides with the sign of
$cost(B)|_A - cost(B')|_A$.

If $A$ or $C$ is empty, the proof is similar. □

Definition: Let $S = A \| B_1 \| C \| B_2 \| D$ be an ordered sequence of subgoals ($A$, $C$ and $D$ may
be empty sequences). With respect to $S$, the pair $\langle B_1, B_2 \rangle$ is

• cn-ordered, if $cn(B_1)|_A \le cn(B_2)|_{A \cup B_1 \cup C}$

• cn-inverted, if $cn(B_1)|_A > cn(B_2)|_{A \cup B_1 \cup C}$

We now show that two adjacent mutually independent sequences of subgoals in a minimal
ordering must be cn-ordered.

Lemma 2

Let $S = A \| B_1 \| B_2 \| C$, $S' = A \| B_2 \| B_1 \| C$, where $B_1$, $B_2$ are mutually independent under $A$.
Let $A$ either be empty or have $nsols(A) > 0$. Then

$$
\begin{aligned}
Cost(S) < Cost(S') &\iff cn(B_1)|_A < cn(B_2)|_A, \\
Cost(S) = Cost(S') &\iff cn(B_1)|_A = cn(B_2)|_A.
\end{aligned}
$$

Proof:

$$
\begin{aligned}
Cost(S) < Cost(S')
&\overset{\text{Lemma 1}}{\iff} cost(B_1 \| B_2)|_A < cost(B_2 \| B_1)|_A \\
&\iff cost(B_1)|_A + nsols(B_1)|_A \times cost(B_2)|_{A \cup B_1} < cost(B_2)|_A + nsols(B_2)|_A \times cost(B_1)|_{A \cup B_2} \\
&\overset{\text{indep.}(B_1, B_2)}{\iff} cost(B_1)|_A + nsols(B_1)|_A \times cost(B_2)|_A < cost(B_2)|_A + nsols(B_2)|_A \times cost(B_1)|_A \\
&\iff nsols(B_1)|_A \times cost(B_2)|_A - cost(B_2)|_A < nsols(B_2)|_A \times cost(B_1)|_A - cost(B_1)|_A \\
&\overset{cost(B_i)|_A > 0}{\iff} \frac{nsols(B_1)|_A - 1}{cost(B_1)|_A} < \frac{nsols(B_2)|_A - 1}{cost(B_2)|_A} \\
&\iff cn(B_1)|_A < cn(B_2)|_A.
\end{aligned}
$$

The proof of $Cost(S) = Cost(S') \iff cn(B_1)|_A = cn(B_2)|_A$ is similar. □



In an independent set, all subgoal pairs are independent, in particular all adjacent pairs. 
So, in a minimal ordering of an independent set, all adjacent subgoal pairs must be cn- 
ordered; otherwise, the cost of the sequence can be reduced by a transposition of such pair. 
This conclusion is expressed in the following theorem. 

Theorem 2

Let $\mathcal{S}$ be an independent set. Let $\vec{S}$ be an ordering of $\mathcal{S}$. $\vec{S}$ is minimal iff all the subgoals in
$\vec{S}$ are sorted in non-decreasing order by their cn values.

Proof:

1. Let $\vec{S}$ be a minimal ordering of $\mathcal{S}$. If $\vec{S}$ contains a cn-inverted adjacent pair of subgoals,
then transposition of this pair reduces the cost of $\vec{S}$ (Lemma 2), contradicting the
minimality of $\vec{S}$.

2. Let $\vec{S}$ be some ordering of $\mathcal{S}$, whose subgoals are sorted in non-decreasing order by
cn. Let $\vec{S}'$ be a minimal ordering of $\mathcal{S}$. According to item 1, $\vec{S}'$ is also sorted by
cn. The only possible difference between the two sequences is the internal ordering
of sub-sequences with equal cn values. The ordering of each such sub-sequence in
$\vec{S}$ can be transformed to the ordering of its counterpart sub-sequence in $\vec{S}'$ by a
finite number of transpositions of adjacent subgoals. By Lemma 2, transpositions of
adjacent independent subgoals with equal cn values cannot change the cost of the
sequence. Therefore, $Cost(\vec{S}) = Cost(\vec{S}')$, and $\vec{S}$ is a minimal ordering of $\mathcal{S}$ (since $\vec{S}'$
is minimal). □



Corollary 1 Algorithm 1 finds a minimal ordering of an independent set of subgoals. 
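Theorem 2 can also be checked empirically. The sketch below is not from the paper: it draws random independent sets (control values chosen arbitrarily, with positive cost and nsols, as the lemmas assume), sorts them by cn, and compares the result with an exhaustive search. The seed, ranges and set size are arbitrary assumptions.

```python
import math
import random
from itertools import permutations

def total_cost(ordering):
    """Cost of an ordering of independent subgoals given as (cost, nsols) pairs."""
    total, multiplier = 0.0, 1.0
    for cost, nsols in ordering:
        total += multiplier * cost
        multiplier *= nsols
    return total

def cn(goal):
    cost, nsols = goal
    return (nsols - 1) / cost

def sorted_is_minimal(goals):
    """True if the cn-sorted ordering is as cheap as the cheapest permutation."""
    by_cn = sorted(goals, key=cn)
    brute = min(total_cost(p) for p in permutations(goals))
    return math.isclose(total_cost(by_cn), brute, rel_tol=1e-9)

rng = random.Random(0)
trials = [[(rng.uniform(1, 20), rng.uniform(0.1, 5)) for _ in range(5)]
          for _ in range(200)]
all_minimal = all(sorted_is_minimal(t) for t in trials)
```

The comparison uses a relative tolerance because orderings of equal true cost may differ slightly in floating-point arithmetic.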
3.3 Ordering of Dependent Sets of Subgoals

Algorithm 1 does not guarantee finding a minimal ordering when the given set of subgoals 
is dependent, as the following proposition shows. 

Proposition 1 When the given set of subgoals is dependent, then: 

1. The result of Algorithm 1 on it is not always defined. 

2. Even when the result is defined, it is not always a minimal ordering of the set. 

Proof: Both claims are proved by counter-examples. 

1. We show a set of subgoals that cannot be ordered by sorting.

   The program:

       a(c1).    b(c1).
       a(c2).    b(c2).

   Control values:

                a(X)|∅    a(X)|{b(X)}    b(X)|∅    b(X)|{a(X)}
       cost     2         2              2         2
       nsols    2         1              2         1
       cn       1/2       0              1/2       0

   The set {a(X), b(X)} has two possible orderings, $\langle a(X), b(X) \rangle$ and $\langle b(X), a(X) \rangle$.
   Both orderings have minimal cost, though neither one is sorted by cn: each ordering
   has cn = 1/2 for its first subgoal, and cn = 0 for the second one. Sorting by cn is
   impossible here: when we transpose subgoals, their cn values are changed, and the
   pair becomes cn-inverted again.

2. We show a set of subgoals that can be ordered by sorting, but its sorted ordering is
   not minimal.

   The program:

       a(c1).    b(c1).
       a(c1).    b(c2) ← a(c1), a(c2).

   Control values:

                a(X)|∅    a(X)|{b(X)}    b(X)|∅    b(X)|{a(X)}
       cost     2         2              8         2
       nsols    2         2              1         1
       cn       1/2       1/2            0         0

   Let the unordered set of subgoals be {a(X), b(X)}. Its ordering $\langle b(X), a(X) \rangle$ is sorted
   by cn, while $\langle a(X), b(X) \rangle$ is not. But $\langle a(X), b(X) \rangle$ is cheaper than $\langle b(X), a(X) \rangle$:

   $$
   cost(\langle a(X), b(X) \rangle) = 2 + 2 \cdot 2 = 6 \qquad cost(\langle b(X), a(X) \rangle) = 8 + 1 \cdot 2 = 10
   $$

□
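The second counter-example can be replayed numerically. The sketch below encodes its control values in a small lookup table keyed by whether X is already bound; this encoding and the function names are illustrative assumptions, not the paper's notation.

```python
# Control values of the second counter-example:
# (subgoal, is X already bound?) -> (cost, nsols).
CONTROL = {
    ("a", False): (2, 2), ("a", True): (2, 2),
    ("b", False): (8, 1), ("b", True): (2, 1),
}

def total_cost(ordering):
    """Cost of an ordering when control values depend on the preceding subgoals."""
    total, multiplier, bound = 0, 1, False
    for goal in ordering:
        cost, nsols = CONTROL[(goal, bound)]
        total += multiplier * cost
        multiplier *= nsols
        bound = True          # either subgoal binds X once it is proved
    return total

def cn_sequence(ordering):
    """cn value of each subgoal, evaluated in the context of its predecessors."""
    values, bound = [], False
    for goal in ordering:
        cost, nsols = CONTROL[(goal, bound)]
        values.append((nsols - 1) / cost)
        bound = True
    return values
```

Here `cn_sequence(("b", "a"))` is non-decreasing while `cn_sequence(("a", "b"))` is not, yet `total_cost` prefers the latter ordering.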

Since sorting cannot guarantee minimal ordering for dependent subgoals, we now con- 
sider alternative ordering algorithms. The simplest algorithm checks every possible permu- 
tation of the set and returns the one with the minimal cost. The listing for this algorithm 
is shown in Figure 5. 

This algorithm runs in $O(\tau \cdot n!)$ time, where $\tau$ is the time it takes to compute the control
values for one subgoal, and $n$ is the number of subgoals.

Algorithm 2

    For each permutation of subgoals, find its cost according to Equation 5.
    Store the currently cheapest permutation and update it when a cheaper
    one is found.
    Finally, return the cheapest permutation.

Figure 5: The algorithm for subgoal ordering by an exhaustive check of all permutations.

The following observation can help to reduce the ordering time at the expense of addi-
tional space. Ordered sequences can be constructed incrementally, by adding subgoals to
the right ends of ordered prefixes. By Lemma 1, if a cheaper permutation of a prefix exists,
then this prefix cannot belong to a minimal ordering. The ordering algorithm can build
prefixes with increasing lengths, at each step adding to the right end of each prefix one of
the subgoals that do not appear in it already, and for each subset keeping only its cheapest
permutation (if several permutations have equal cost, any one of them can be chosen). The
listing for this algorithm is shown in Figure 6.

Algorithm 3

    Order(S)
        let P_0 ← {∅}, n ← |S|
        loop for k = 1 to n
            P'_k ← {P || B | P ∈ P_{k-1}, B ∈ S \ P}
            P_k ← {P ∈ P'_k | ∀P' ∈ P'_k, [permutation(P, P') ⇒ Cost(P) ≤ Cost(P')]}
        Return the single member of P_n.

Figure 6: The ordering algorithm which checks permutations of ordered prefixes.

At each step $k$, $P'_k$ stores the set of prefixes from step $k - 1$ extended by every subgoal
not appearing there already. $P_k \subseteq P'_k$, and in $P_k$ each subset of subgoals is represented
only by its cheapest permutation. Obviously, $|P_k| = \binom{n}{k}$ (one prefix is kept for every subset
of $\mathcal{S}$ of size $k$). For each prefix of length $k - 1$, there are $n - (k - 1)$ possible continuations
of length $k$. The size of $P'_k$ is as follows:

$$
|P'_k| = |P_{k-1}| \cdot (n - (k-1)) = \frac{n!}{(k-1)!\,(n-(k-1))!} \cdot (n - (k-1)) = \frac{n!}{(k-1)!\,(n-k)!} = k \cdot \binom{n}{k}.
$$

For each prefix, we compute its cost in $\tau$ time. The permutation test can be completed
in $O(n)$ time, by using, for example, a trie structure (Aho et al., 1987), where subgoals in
prefixes are sorted lexicographically. Each step $k$ takes $O((n + \tau) \cdot k \cdot \binom{n}{k})$ time, and the
whole algorithm runs in

$$
\sum_{k=1}^{n} O\left((n + \tau) \cdot k \cdot \binom{n}{k}\right) = O\left(n \cdot (n + \tau) \cdot \sum_{k=1}^{n} \binom{n}{k}\right) = O(n \cdot (n + \tau) \cdot 2^n).
$$

If $\tau = O(n)$, this makes $O(n^2 \cdot 2^n)$.
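The prefix-based scheme of Algorithm 3 can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: control values are supplied by a hypothetical callback `control(goal, bound)` returning `(cost, nsols)`, and the permutation test is done with a dictionary keyed by the set of subgoals in the prefix instead of the trie mentioned above.

```python
def make_cost(control):
    """Build a cost function from control(goal, bound) -> (cost, nsols)."""
    def total_cost(ordering):
        total, multiplier, bound = 0.0, 1.0, frozenset()
        for goal in ordering:
            cost, nsols = control(goal, bound)
            total += multiplier * cost
            multiplier *= nsols
            bound = bound | {goal}
        return total
    return total_cost

def order_by_prefixes(goals, control):
    """Keep, for every subset of subgoals, only its cheapest ordered prefix."""
    total_cost = make_cost(control)
    best = {frozenset(): ()}            # subset -> cheapest permutation found for it
    for _ in range(len(goals)):
        extended = {}
        for subset, prefix in best.items():
            for goal in goals:
                if goal in subset:
                    continue
                candidate = prefix + (goal,)
                key = subset | {goal}
                if key not in extended or total_cost(candidate) < total_cost(extended[key]):
                    extended[key] = candidate
        best = extended
    return best[frozenset(goals)]

# Example 2 (independent subgoals: the context is ignored).
def example2_control(goal, bound):
    return {"p": (10, 1), "q": (20, 5), "r": (5, 0.1)}[goal]

# Second counter-example of Proposition 1 (b is cheaper once X is bound).
def prop1_control(goal, bound):
    if goal == "a":
        return (2, 2)
    return (2, 1) if bound else (8, 1)

independent_result = order_by_prefixes(["p", "q", "r"], example2_control)
dependent_result = order_by_prefixes(["a", "b"], prop1_control)
```

For clarity the sketch recomputes prefix costs from scratch; storing the cost and the solution multiplier with each prefix would avoid that.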

Smith and Genesereth (1985) and Natarajan (1987) point out that in a minimal ordered
sequence every adjacent pair of subgoals must satisfy an adjacency restriction. The most
general form of such a restriction in our notation says that two adjacent subgoals $A_k$ and
$A_{k+1}$ in a minimal ordering $\langle A_1, A_2, \ldots, A_n \rangle$ must satisfy

$$
cost(\langle A_k, A_{k+1} \rangle)|_{\{A_1 \ldots A_{k-1}\}} \le cost(\langle A_{k+1}, A_k \rangle)|_{\{A_1 \ldots A_{k-1}\}}. \qquad (8)
$$

The restriction follows immediately from Lemma 1. However, it can only help to find a 
locally minimal ordering, i.e., an ordering that cannot be improved by transpositions of 
adjacent subgoals. It is possible that all adjacent subgoal pairs satisfy Equation 8, but the 
ordering is still not minimal. The following example illustrates this statement. 

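Example 3 Let the unordered set be {p(X), q(X), r(X)}, where the predicates are defined
by the following program:

    p(c1).        q(c1).        r(c1).
    p(c2) ← f.    q(c2).        r(c1).
                  q(c3) ← f.

    f ← fails after 50 unifications.

The ordering $\langle p(X), q(X), r(X) \rangle$ satisfies the adjacency restriction (Equation 8):

$$
\begin{aligned}
cost(\langle p(X), q(X) \rangle)|_\emptyset &= 55 & cost(\langle q(X), r(X) \rangle)|_{\{p(X)\}} &= 5 \\
cost(\langle q(X), p(X) \rangle)|_\emptyset &= 107 & cost(\langle r(X), q(X) \rangle)|_{\{p(X)\}} &= 8
\end{aligned}
$$

But it is not minimal:

$$
\begin{aligned}
cost(\langle p(X), q(X), r(X) \rangle) &= 57 \\
cost(\langle r(X), p(X), q(X) \rangle) &= 12
\end{aligned}
$$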
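The six costs quoted in Example 3 can be reproduced under one possible cost model. The control values below are not given in the text; they are derived here under the assumption that every clause-head unification attempt costs one unit and that a call to f costs 50 units, and they are averaged when X may be c1 or c2. Treat them as an illustration that happens to match the stated totals.

```python
def control(goal, bound):
    """Assumed (cost, nsols) of a subgoal given the set of preceding subgoals."""
    if not bound:                                   # X is still free
        return {"p": (52, 1), "q": (53, 2), "r": (2, 2)}[goal]
    if bound == {"q"}:                              # X is c1 or c2: averaged values
        return {"p": (27, 0.5), "r": (2, 1)}[goal]
    return {"p": (2, 1), "q": (3, 1), "r": (2, 2)}[goal]   # X is c1

def total_cost(ordering, bound=frozenset()):
    """Cost of an ordering placed after the subgoals in `bound`."""
    total, multiplier = 0, 1
    for goal in ordering:
        cost, nsols = control(goal, bound)
        total += multiplier * cost
        multiplier *= nsols
        bound = bound | {goal}
    return total

def satisfies_adjacency(ordering):
    """Equation 8 checked for every adjacent pair of the ordering."""
    for k in range(len(ordering) - 1):
        before = frozenset(ordering[:k])
        pair = ordering[k:k + 2]
        if total_cost(pair, before) > total_cost(pair[::-1], before):
            return False
    return True
```

With these values `satisfies_adjacency(("p", "q", "r"))` holds even though `total_cost(("r", "p", "q"))` is far lower, which is the local-minimum effect the example describes.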

To find a globally minimal ordering, it seems beneficial to combine the prefix algorithm 
with the adjacency restriction: if a prefix does not satisfy the adjacency restriction, then 
there is a cheaper permutation of this prefix. The adjacency test can be performed faster 
than the permutation test, since it must only consider the two last subgoals of each pre- 
fix. Nevertheless, the number of prefixes remaining after each step of Algorithm 3 is not 
reduced: if a prefix is rejected due to a violation of the adjacency restriction, it would have 
also been rejected by the permutation test. Furthermore, if the adjacency restriction test 
does not fail, we should still perform the permutation test to avoid local minima (as in 
Example 3). The adjacency test succeeds in at least half of the cases: if we examine a
prefix $\langle A_1, \ldots, A_k, B_1, B_2 \rangle$, we shall also examine $\langle A_1, \ldots, A_k, B_2, B_1 \rangle$, and the adjacency test
cannot fail in both. Consequently, addition of the adjacency test can only halve the total
running time of the ordering algorithm, leaving it $O(n^2 \cdot 2^n)$ in the worst case.






Smith and Genesereth propose performing a best-first search in the space of ordered 
prefixes, preferring prefixes with lower cost. The best-first search can be combined with 
the permutation test and the adjacency restriction. In addition, when the subgoals not 
in a prefix are independent under its binding, they can be sorted, and the sorted result 
concatenated to the prefix. By Lemma 1 and Corollary 1, this produces the cheapest
completion of this prefix. When we perform completion, there is no need to perform the 
adjacency or permutation test: if a complete sequence is not minimal, it will never be chosen 
as the cheapest prefix; even if it is added to the list of prefixes, it will never be extracted 
therefrom. The resulting algorithm is shown in Figure 7. 



Algorithm 4

    Order(S)
        let prefix-list ← ∅, prefix ← ∅, rest ← S
        loop until empty(rest)
            if Independent(rest|prefix)
            then
                let completion ← prefix || Sort-by-cn(rest|prefix)
                Insert-By-Cost(completion, prefix-list)
            else
                loop for subgoal ∈ rest
                    let extension ← prefix || subgoal
                    if Adjacency-Restriction-Test(extension)
                       and Permutation-Test(extension)
                    then
                        Insert-By-Cost(extension, prefix-list)
            prefix ← Cheapest(prefix-list)
            Remove-from-list(prefix, prefix-list)
            rest ← S \ prefix
        Return prefix

Figure 7: An algorithm for subgoal ordering, incorporating the ideas of earlier researchers.

The advantage of using best-first search is that it avoids expanding prefixes whose cost 
is higher than the cost of the minimal ordering. The policy used by the algorithm may, 
however, be suboptimal or even harmful. It often happens that the best completion of a 
cheaper prefix is much more expensive than the best completion of a more expensive prefix. 
When the number of solutions is large, it is better to place subgoals with high costs closer 
to the beginning of the ordering to reduce the number of times that their cost is multiplied. 

For example, let the set be {a(X), b(X)}, with cost(a(X)) = 10, cost(b(X)) = nsols(a(X))
= nsols(b(X)) = 2. Then a minimal ordering starts with the most expensive prefix:

$$
\begin{aligned}
Cost(\langle a(X), b(X) \rangle) &= 10 + 2 \cdot 2 = 14 \\
Cost(\langle b(X), a(X) \rangle) &= 2 + 2 \cdot 10 = 22
\end{aligned}
$$

If there are many prefixes whose cost is higher than the cost of the minimal ordering, then 
best-first search saves time. But if the number of such prefixes is small, using best-first 
search can increase the total time, due to the need to perform insertion of a prefix into a 
priority queue, according to its cost. 
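The behaviour described above can be observed with a simplified best-first search over ordered prefixes. The sketch deliberately omits the adjacency test, the permutation test and the sort-by-cn completion of Algorithm 4, so it is only a uniform-cost search; the callback `control(goal, bound)` and the returned list of expanded prefixes are assumptions made for illustration.

```python
import heapq
from itertools import count

def best_first_order(goals, control):
    """Uniform-cost search over prefixes; control(goal, bound) -> (cost, nsols)."""
    tie = count()                             # tie-breaker for equal-cost heap entries
    heap = [(0.0, next(tie), (), 1.0)]        # (cost so far, tie, prefix, product of nsols)
    expanded = []
    while heap:
        cost, _, prefix, multiplier = heapq.heappop(heap)
        if len(prefix) == len(goals):
            return prefix, cost, expanded     # first complete ordering popped is minimal
        expanded.append(prefix)
        bound = frozenset(prefix)
        for goal in goals:
            if goal in bound:
                continue
            g_cost, g_nsols = control(goal, bound)
            heapq.heappush(heap, (cost + multiplier * g_cost, next(tie),
                                  prefix + (goal,), multiplier * g_nsols))
    return (), 0.0, expanded

# The two-subgoal example above: cost(a) = 10, cost(b) = nsols(a) = nsols(b) = 2.
def example_control(goal, bound):
    return {"a": (10, 2), "b": (2, 2)}[goal]

ordering, cost, expanded = best_first_order(["a", "b"], example_control)
```

In this run the cheap prefix consisting of b alone is expanded before the prefix consisting of a, although the minimal ordering starts with a.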

A sample run of Algorithm 4 will be shown later (in Section 4.7). 



4. The Divide-and-Conquer Subgoal Ordering Algorithm 

Algorithm 1 presented in Section 3.2 is very efficient, but is applicable only when the entire 
set of subgoals is independent. Algorithm 3 can handle a dependent set of subgoals but is 
very inefficient. Algorithm 4, a combination of the two, can exploit independence of sub- 
goals for better efficiency. However, the obtained benefit is quite limited. In this section, 
we present the Divide-and-Conquer (DAC) algorithm, which is able to exploit subgoal inde-
pendence in a more elaborate way. The algorithm divides the set of subgoals into smaller
subsets, orders these subsets recursively and combines the results. 

4.1 Divisibility Trees of Subgoal Sets 

In this subsection we define a structure that represents all the ways of breaking a subgoal 
set into independent parts. Our algorithm will work by traversing this structure. 

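Definition: Let $\mathcal{S}$ and $\mathcal{B}$ be sets of subgoals. The divisibility tree of $\mathcal{S}$ under $\mathcal{B}$, $DTree(\mathcal{S}, \mathcal{B})$,
is an AND-OR tree defined as follows:

$$
DTree(\mathcal{S}, \mathcal{B}) =
\begin{cases}
leaf(\mathcal{S}, \mathcal{B}) & \mathcal{S} \text{ is independent under } \mathcal{B} \\
OR(\mathcal{S}, \mathcal{B}, \{ DTree(\mathcal{S} \setminus \{B_i\}, \mathcal{B} \cup \{B_i\}) \mid B_i \in \mathcal{S} \}) & \mathcal{S} \text{ is indivisible under } \mathcal{B} \\
AND(\mathcal{S}, \mathcal{B}, \{ DTree(\mathcal{S}_i, \mathcal{B}) \mid \mathcal{S}_i \in DPart(\mathcal{S}, \mathcal{B}) \}) & \mathcal{S} \text{ is divisible under } \mathcal{B}
\end{cases}
$$

Each node $N$ in the tree $DTree(\mathcal{S}_0, \mathcal{B}_0)$ has an associated set of subgoals $\mathcal{S}(N) \subseteq \mathcal{S}_0$ and
an associated binding set $\mathcal{B}(N) \supseteq \mathcal{B}_0$. For the root node, $\mathcal{S}(N) = \mathcal{S}_0$, $\mathcal{B}(N) = \mathcal{B}_0$. If the
binding set of the root is not specified explicitly, we assume it to be empty. For AND-nodes
and OR-nodes we also define the sets of children.

• If $\mathcal{S}(N)$ is independent under $\mathcal{B}(N)$, then $N$ is a leaf.

• If $\mathcal{S}(N)$ is indivisible under $\mathcal{B}(N)$, then $N$ is an OR-node. Each subgoal $B_i$ in $\mathcal{S}(N)$
defines a child node whose set of subgoals is $\mathcal{S}(N) \setminus \{B_i\}$ and the binding set is
$\mathcal{B}(N) \cup \{B_i\}$. We call $B_i$ the binder of the generated child. Note that the binding
set of every node in a divisibility tree is the union of the binders of all its indivisible
ancestors and of the root's binding set.

• If $\mathcal{S}(N)$ is divisible under $\mathcal{B}(N)$, then $N$ is an AND-node. Each subset $\mathcal{S}_i$ in the
divisibility partition $DPart(\mathcal{S}(N), \mathcal{B}(N))$ defines a child node with associated set of
subgoals $\mathcal{S}_i$ and binding set $\mathcal{B}(N)$. Divisibility partition was defined in Section 3.2.1.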
    n1 (AND-node)   S(n1) = {a, b, c(X), d(X), e(X)},  B(n1) = ∅
    ├── n2 (leaf)       S(n2) = {a, b},                B(n2) = ∅
    └── n3 (OR-node)    S(n3) = {c(X), d(X), e(X)},    B(n3) = ∅
        ├── n4 (leaf)   S(n4) = {d(X), e(X)},          B(n4) = {c(X)}
        ├── n5 (leaf)   S(n5) = {c(X), e(X)},          B(n5) = {d(X)}
        └── n6 (leaf)   S(n6) = {c(X), d(X)},          B(n6) = {e(X)}

Figure 8: The divisibility tree of {a, b, c(X), d(X), e(X)} under empty initial binding set. The set
    associated with node n1 is divisible, and is represented by an AND-node. Its children
    correspond to its divisibility subsets - one independent, S(n2) = {a, b}, and one indivis-
    ible, S(n3) = {c(X), d(X), e(X)}. n3 is an OR-node, whose children correspond to its
    three subgoals (each subgoal serves as a binder in one of the children). The sets S(n2),
    S(n4), S(n5) and S(n6) are independent under their respective binding sets, and their
    nodes are leaves. Here we assumed that the subgoals c(X), d(X) and e(X) bind X as a
    result of their proof.



It is easy to show that the divisibility tree of a set of subgoals is unique up to the order of
children of each node. Figure 8 shows the divisibility tree of the set {a, b, c(X), d(X), e(X)}
under empty initial binding set. The associated sets and binding sets are written next to
the nodes.
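A divisibility tree like the one in Figure 8 can be built with a few lines of code. The sketch below rests on several assumptions that are not taken from the paper: subgoals are given as a dictionary from a name to its set of variables, a binder binds all of its variables (as assumed in the caption of Figure 8), subgoals sharing no unbound variable with any other subgoal are collected into one independent subset, and nodes are plain tuples. The partition uses a small union-find structure.

```python
def dpart(goals, bound_vars):
    """Return (indivisible groups, independent subgoals) of goals under bound_vars."""
    parent = {g: g for g in goals}

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]     # path halving
            g = parent[g]
        return g

    owner = {}                                # unbound variable -> first subgoal using it
    for g in goals:
        for v in goals[g] - bound_vars:
            if v in owner:
                parent[find(g)] = find(owner[v])
            else:
                owner[v] = g

    groups = {}
    for g in goals:
        groups.setdefault(find(g), set()).add(g)
    indivisible = [frozenset(m) for m in groups.values() if len(m) > 1]
    independent = frozenset(g for m in groups.values() if len(m) == 1 for g in m)
    return indivisible, independent

def dtree(goals, binders=frozenset(), bound_vars=frozenset()):
    """Build ('leaf' | 'OR' | 'AND', S(N), B(N)[, children]) recursively."""
    names = frozenset(goals)
    indivisible, independent = dpart(goals, bound_vars)
    if not indivisible:                                       # independent set
        return ("leaf", names, binders)
    if len(indivisible) == 1 and not independent:             # indivisible set
        children = [dtree({g: v for g, v in goals.items() if g != b},
                          binders | {b}, bound_vars | goals[b])
                    for b in sorted(goals)]
        return ("OR", names, binders, children)
    parts = ([independent] if independent else []) + indivisible
    children = [dtree({g: goals[g] for g in sorted(part)}, binders, bound_vars)
                for part in parts]
    return ("AND", names, binders, children)

# The set of Figure 8: a and b have no variables; c, d and e share X.
figure8 = {"a": set(), "b": set(), "c": {"X"}, "d": {"X"}, "e": {"X"}}
tree = dtree(figure8)
```

On the set of Figure 8 the root is an AND-node with a leaf for {a, b} and an OR-node for {c, d, e} whose three children are leaves.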

The following lemma expresses an important property of divisibility trees: subgoals of
each node are independent of the rest of subgoals under the binding set of the node.

Lemma 3 Let $\mathcal{S}_0$ be a set of subgoals. Then for every node $N$ in $DTree(\mathcal{S}_0, \emptyset)$, for every
subgoal $A \in \mathcal{S}(N)$, and for every subgoal $Y \in \mathcal{S}_0 \setminus (\mathcal{S}(N) \cup \mathcal{B}(N))$, $A$ and $Y$ are independent
under $\mathcal{B}(N)$.

Proof: by induction on the depth of $N$ in the divisibility tree.

Inductive base: $N$ is the root node, $\mathcal{S}_0 \setminus \mathcal{S}(N)$ is empty, and no such $Y$ exists.

Inductive hypothesis: The lemma holds for $M$, the parent node of $N$.

Inductive step: Let $A \in \mathcal{S}(N)$, $Y \in \mathcal{S}_0 \setminus (\mathcal{S}(N) \cup \mathcal{B}(N))$. $A \in \mathcal{S}(M)$, and for $M$ the
lemma holds, thus either $A$ and $Y$ are independent under $\mathcal{B}(M)$, or $Y \in \mathcal{S}(M)$.

If $A$ and $Y$ are independent under $\mathcal{B}(M)$, then they are also independent under $\mathcal{B}(N)$,
since $\mathcal{B}(M) \subseteq \mathcal{B}(N)$. Otherwise, $A$ and $Y$ are dependent under $\mathcal{B}(M)$, and $Y \in \mathcal{S}(M)$.

• If $M$ is an AND-node, and $A$ and $Y$ are dependent under $\mathcal{B}(M)$, then $A$ and
$Y$ belong to the same element of $DPart(\mathcal{S}(M), \mathcal{B}(M))$, and $Y \in \mathcal{S}(N)$ - a
contradiction.

• If $M$ is an OR-node and $Y \in \mathcal{S}(M) \setminus \mathcal{S}(N)$, then $Y$ must be the binder of $N$.
But then $\mathcal{B}(N) = \mathcal{B}(M) \cup \{Y\}$ and $Y \in \mathcal{B}(N)$ - a contradiction again. □

The lemma relates to subgoal independence inside divisibility trees. We shall sometimes
need to argue about independence inside ordered sequences of subgoals. The following
corollary provides the necessary connecting link.

Corollary 2 Let $\mathcal{S}_0$ be a set of subgoals, $N$ be a node in the divisibility tree of $\mathcal{S}_0$, $\vec{S}$ an
ordering of $\mathcal{S}_0$, $\vec{S} = \vec{S}_1 \| \vec{S}_2$, where $\mathcal{B}(N) \subseteq \vec{S}_1$ and $\mathcal{S}(N) \subseteq \vec{S}_2$. Then $\mathcal{S}(N)$ is mutually
independent of $\vec{S}_2 \setminus \mathcal{S}(N)$ under $\vec{S}_1$.

Proof: Let $A \in \mathcal{S}(N)$, $Y \in \vec{S}_2 \setminus \mathcal{S}(N)$. $A$ and $Y$ are independent under $\mathcal{B}(N)$, by the
preceding lemma. Since $\mathcal{B}(N) \subseteq \vec{S}_1$, $A$ and $Y$ are independent under $\vec{S}_1$. Every subgoal of
$\mathcal{S}(N)$ is independent of every subgoal of $\vec{S}_2 \setminus \mathcal{S}(N)$ under $\vec{S}_1$; therefore, $\mathcal{S}(N)$ and $\vec{S}_2 \setminus \mathcal{S}(N)$
are mutually independent under $\vec{S}_1$. □

4.2 Valid Orderings in Divisibility Trees 

The aim of our ordering algorithm is to find a minimal ordering of a given set of subgoals. 
We construct orderings following a divide-and-conquer policy: larger sets are split into 
smaller ones, and orderings of the smaller sets are combined to produce an ordering of the 
larger set. To implement this policy, we perform a post-order traversal of the divisibility 
tree corresponding to the given set of subgoals under an empty initial binding set. When 
orderings of child nodes are combined to produce an ordering of the parent node, the inner 
order of their subgoals is not changed: smaller orderings are consistent with larger orderings. 

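Definition: Let $\mathcal{S}$ and $\mathcal{Q} \subseteq \mathcal{S}$ be sets of subgoals. An ordering $\vec{O}_Q$ of $\mathcal{Q}$ and an ordering
$\vec{O}_S$ of $\mathcal{S}$ are consistent (denoted $Cons(\vec{O}_Q, \vec{O}_S)$), if the order of subgoals of $\mathcal{Q}$ in $\vec{O}_Q$ and in
$\vec{O}_S$ is the same.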
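The consistency relation is easy to state in code. The following sketch, with orderings represented as tuples of subgoal names (an assumption made here for illustration), checks that the subgoals of the smaller ordering appear in the larger one in the same relative order.

```python
def consistent(o_q, o_s):
    """Cons(O_Q, O_S): the subgoals of O_Q keep their relative order inside O_S."""
    members = set(o_q)
    return [g for g in o_s if g in members] == list(o_q)
```

For instance, an ordering that places a1(X) before a2(X) is consistent with a larger ordering only if a1(X) also precedes a2(X) there.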

The divide-and-conquer process described above seems analogous to Merge Sort (Knuth,
1973). There, the set of numbers is split into two (or more) subsets, each subset is inde-
pendently ordered to a sequence consistent with the global order, and these sequences are
merged. Is it possible to use a similar method for subgoal ordering? Assume that a set
of subgoals is partitioned into two mutually independent subsets, $\mathcal{A}$ and $\mathcal{B}$. Can we build
an algorithm that, given $\mathcal{A}$, produces its ordering consistent with a minimal ordering of
$\mathcal{A} \cup \mathcal{B}$, independently of $\mathcal{B}$? Unfortunately, the answer is negative. An ordering of $\mathcal{A}$ may
be consistent with a minimal ordering of $\mathcal{A} \cup \mathcal{B}_1$ but at the same time not be consistent
with a minimal ordering of $\mathcal{A} \cup \mathcal{B}_2$ for some $\mathcal{B}_1 \ne \mathcal{B}_2$.

For example, let $\mathcal{A}$ = {a1(X), a2(X)}, $\mathcal{B}_1$ = {b}, $\mathcal{B}_2$ = {d} and the control values be as
specified in Figure 9. The single minimal ordering of $\mathcal{A} \cup \mathcal{B}_1$ is $\langle a2(X), b, a1(X) \rangle$, while the
single minimal ordering of $\mathcal{A} \cup \mathcal{B}_2$ is $\langle d, a1(X), a2(X) \rangle$. There is no ordering of $\mathcal{A}$ consistent
with both these minimal global orderings.
The program:

    a1(c1).                b ← a1(X).
    a1(c1).                b ← d.
    a2(c1).                d.
    a2(c1).
    a2(c2) ← a1(c2).

The control values:

             a1(X)|∅    a1(X)|{a2(X)}    a2(X)|∅    a2(X)|{a1(X)}    b    d
    cost     2          2                5          3                5    1
    nsols    2          2                2          2                3    1

    Cost(b, a1(X), a2(X)) = 5 + 3·2 + 3·2·3 = 29
    Cost(b, a2(X), a1(X)) = 5 + 3·5 + 3·2·2 = 32
    Cost(a1(X), b, a2(X)) = 2 + 2·5 + 2·3·3 = 30
    Cost(a1(X), a2(X), b) = 2 + 2·3 + 2·2·5 = 28
    Cost(a2(X), b, a1(X)) = 5 + 2·5 + 2·3·2 = 27
    Cost(a2(X), a1(X), b) = 5 + 2·2 + 2·2·5 = 29

    Cost(d, a1(X), a2(X)) = 1 + 1·2 + 1·2·3 = 9
    Cost(d, a2(X), a1(X)) = 1 + 1·5 + 1·2·2 = 10
    Cost(a1(X), d, a2(X)) = 2 + 2·1 + 2·1·3 = 10
    Cost(a1(X), a2(X), d) = 2 + 2·3 + 2·2·1 = 12
    Cost(a2(X), d, a1(X)) = 5 + 2·1 + 2·1·2 = 11
    Cost(a2(X), a1(X), d) = 5 + 2·2 + 2·2·1 = 13

Figure 9: We show a small program and the control values it defines. Then we compute costs of all
    permutations of the sets {b, a1(X), a2(X)} and {d, a1(X), a2(X)}. Different orderings of
    {a1(X), a2(X)} are consistent with minimal orderings of these sets.
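The twelve costs of Figure 9 can be recomputed mechanically. In the sketch below the control values are taken from the figure, while the function names and the encoding of the context as the set of preceding subgoals are illustrative assumptions.

```python
from itertools import permutations

def control(goal, bound):
    """Control values of Figure 9: (cost, nsols) given the preceding subgoals."""
    if goal == "a2":
        return (3, 2) if "a1" in bound else (5, 2)
    return {"a1": (2, 2), "b": (5, 3), "d": (1, 1)}[goal]

def total_cost(ordering):
    total, multiplier, bound = 0, 1, frozenset()
    for goal in ordering:
        cost, nsols = control(goal, bound)
        total += multiplier * cost
        multiplier *= nsols
        bound = bound | {goal}
    return total

def minimal_orderings(goals):
    """All cheapest permutations of goals, together with their cost."""
    costs = {p: total_cost(p) for p in permutations(goals)}
    low = min(costs.values())
    return [p for p, c in costs.items() if c == low], low

with_b = minimal_orderings(("a1", "a2", "b"))
with_d = minimal_orderings(("a1", "a2", "d"))
```

The two results place a1(X) and a2(X) in opposite relative orders, which is why a single ordering of {a1(X), a2(X)} cannot serve both sets.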



Since, unlike the case of Merge Sort, we cannot always identify a single ordering of the 
subset consistent with a minimal ordering of the whole set, our algorithm will deal with 
sets of candidate orderings. Our requirement from such a set is that it contain at least 
one local ordering consistent with a global minimal ordering, if such a local ordering exists 
("local" ordering is an ordering of the set of the node, "global" ordering is an ordering of 
the set of the root). Such a set will be called valid. The following definition defines valid 
sets formally, together with several other concepts. 

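Definition: Let $\mathcal{S}_0$ be a set of subgoals and $N$ be a node in the divisibility tree of $\mathcal{S}_0$.
Recall that $\pi(\mathcal{S})$ denotes the set of all permutations of $\mathcal{S}$.

1. $\vec{O}_S \in \pi(\mathcal{S}_0)$ is binder-consistent with $\vec{O}_N \in \pi(\mathcal{S}(N))$ (denoted $BC_N(\vec{O}_N, \vec{O}_S)$), if they
are consistent, and all subgoals of $\mathcal{B}(N)$ appear in $\vec{O}_S$ before all subgoals of $\vec{O}_N$:

$$
BC_N(\vec{O}_N, \vec{O}_S) \iff \exists \vec{O}_B \in \pi(\mathcal{B}(N)) : Cons(\vec{O}_B \| \vec{O}_N, \vec{O}_S).
$$

$\vec{O}_S \in \pi(\mathcal{S}_0)$ is binder-consistent with the node $N$ (denoted $BC_N(\vec{O}_S)$), if it is binder-
consistent with some ordering of $\mathcal{S}(N)$:

$$
BC_N(\vec{O}_S) \iff \exists \vec{O}_N \in \pi(\mathcal{S}(N)) : BC_N(\vec{O}_N, \vec{O}_S).
$$

2. $\vec{O}_N \in \pi(\mathcal{S}(N))$ is min-consistent with $\vec{O}_S \in \pi(\mathcal{S}_0)$ (denoted $MC_{N,\mathcal{S}_0}(\vec{O}_N, \vec{O}_S)$), if they
are binder-consistent, and $\vec{O}_S$ is minimal:

$$
MC_{N,\mathcal{S}_0}(\vec{O}_N, \vec{O}_S) \iff BC_N(\vec{O}_N, \vec{O}_S) \wedge Min(\vec{O}_S, \mathcal{S}_0).
$$

$\vec{O}_N \in \pi(\mathcal{S}(N))$ is min-consistent (denoted $MC_{N,\mathcal{S}_0}(\vec{O}_N)$), if it is min-consistent with
some ordering of $\mathcal{S}_0$:

$$
MC_{N,\mathcal{S}_0}(\vec{O}_N) \iff \exists \vec{O}_S \in \pi(\mathcal{S}_0) : MC_{N,\mathcal{S}_0}(\vec{O}_N, \vec{O}_S).
$$

3. An ordering $\vec{O}_N \in \pi(\mathcal{S}(N))$ is MC-contradicting, if it is not min-consistent:

$$
MCC_{N,\mathcal{S}_0}(\vec{O}_N) \iff \neg MC_{N,\mathcal{S}_0}(\vec{O}_N).
$$

4. Two orderings $\vec{O}_1, \vec{O}_2 \in \pi(\mathcal{S}(N))$ are MC-equivalent, if one of them is min-consistent
iff the other one is:

$$
MCE_{N,\mathcal{S}_0}(\vec{O}_1, \vec{O}_2) \iff \left[ MC_{N,\mathcal{S}_0}(\vec{O}_1) \iff MC_{N,\mathcal{S}_0}(\vec{O}_2) \right].
$$

5. A set of orderings $C_N \subseteq \pi(\mathcal{S}(N))$ is valid, if $C_N$ contains a min-consistent ordering
(when at least one min-consistent ordering of $\mathcal{S}(N)$ exists):

$$
Valid_{N,\mathcal{S}_0}(C_N) \iff \left[ \exists \vec{O}'_N \in \pi(\mathcal{S}(N)) : MC_{N,\mathcal{S}_0}(\vec{O}'_N) \right] \Rightarrow \left[ \exists \vec{O}_N \in C_N : MC_{N,\mathcal{S}_0}(\vec{O}_N) \right].
$$

An important property of valid sets is that a valid set of orderings of the root of
$DTree(\mathcal{S}_0, \emptyset)$ must contain a minimal ordering of $\mathcal{S}_0$. Indeed, in the root $\mathcal{S}(N) = \mathcal{S}_0$,
and consistency becomes identity. Also, $\mathcal{B}(N) = \emptyset$, so that binder-consistency becomes
consistency, and min-consistency becomes minimality. Since there always exists a minimal
ordering of $\mathcal{S}_0$, a valid set of orderings of the root must contain a minimal ordering of $\mathcal{S}_0$.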

4.3 The Outline of the Divide-and-Conquer Algorithm 

We propose an algorithm that is based on producing valid sets of orderings. Each node in 
a divisibility tree produces a valid set for its associated set of subgoals, and passes it to 
its parent node. After the valid set of the root node is found, we compare costs of all its 
members, and return the cheapest one. 

The set of orderings produced by the algorithm for a node $N$ is called a candidate set
of $N$. Its members are called candidate orderings of $N$, or simply candidates. To find
a candidate set of $N$, we first consider the set of all possible orderings of $\mathcal{S}(N)$ that are
consistent with candidates of $N$'s children. This set is called the consistency set of $N$.
Given the candidate sets of $N$'s children, the consistency set of $N$ is defined uniquely. A
candidate set of $N$ is usually not unique.

Definition: Let $N$ be a node in a divisibility tree of $\mathcal{S}_0$. The consistency set of $N$, denoted
$ConsSet(N)$, and the candidate set of $N$, denoted $CandSet(N)$, are defined recursively:

• If $N$ is a leaf, its consistency set contains all permutations of $\mathcal{S}(N)$:

$$
ConsSet(N) = \pi(\mathcal{S}(N)).
$$

• If $N$ is an AND-node, and its child nodes are $N_1, N_2, \ldots, N_k$, we define the consistency
set of $N$ as the set of all possible orderings of $\mathcal{S}(N)$ consistent with candidates of
$N_1, N_2, \ldots, N_k$:

$$
ConsSet(N) = \left\{ \vec{O}_N \in \pi(\mathcal{S}(N)) \mid \forall i\ (1 \le i \le k),\ \exists \vec{O}_i \in CandSet(N_i) : Cons(\vec{O}_i, \vec{O}_N) \right\}.
$$

• If $N$ is an OR-node, and its child node corresponding to every binder $A \in \mathcal{S}(N)$ is
$N_A$, then the consistency set of $N$ is obtained by adding binders as the first elements
to the candidates of the children:

$$
ConsSet(N) = \left\{ A \| \vec{O}_{N_A} \mid A \in \mathcal{S}(N),\ \vec{O}_{N_A} \in CandSet(N_A) \right\}.
$$

• A candidate set of $N$ is any set of orderings produced by removing MC-contradicting
and MC-equivalent orderings out of the consistency set of $N$, while keeping at least
one representative for each group of MC-equivalent orderings:

$$
\begin{aligned}
& CandSet(N) \subseteq ConsSet(N), \\
& \vec{O}_N \in (ConsSet(N) \setminus CandSet(N)) \Rightarrow MCC_{N,\mathcal{S}_0}(\vec{O}_N) \vee \exists \vec{O}'_N \in CandSet(N) : MCE_{N,\mathcal{S}_0}(\vec{O}_N, \vec{O}'_N).
\end{aligned}
$$

(In other words, if some ordering is rejected, it is either MC-contradicting, or MC-
equivalent to some other ordering, which is not rejected.)

There are two kinds of orderings which can be removed from $ConsSet(N)$ while re-
taining its validity: MC-contradicting and MC-equivalent orderings. Removal of an MC-
contradicting ordering cannot change the number of min-consistent orderings in the set; if
we remove an MC-equivalent ordering, then even if it is min-consistent, some other min-
consistent ordering is retained in the set. If there exists a min-consistent ordering of the set
of the node, then its candidate set must contain a min-consistent ordering, and therefore
the candidate set is valid.

Note that when our algorithm treats an OR-node, the binder of each child is always
placed as the first subgoal of the produced ordering of this node. On higher levels the inner
order of subgoals in the ordering does not change (consistency is preserved). Therefore,
our algorithm can only produce binder-consistent orderings. This explains the choice of
the names "binder" and "binding set": the subgoals of $\mathcal{B}(N)$ bind some common variables
of $\mathcal{S}(N)$, since they stand to the left of them in any global ordering that our algorithm
produces. In particular, if $\mathcal{S}(N)$ is independent under $\mathcal{B}(N)$, then the subgoals of $\mathcal{B}(N)$
bind all the shared free variables of $\mathcal{S}(N)$.

To implement the DPart function, we can use the Union-Find data structure (Cormen,
Leiserson, & Rivest, 1991, Chapter 22), where subgoals are elements, and indivisible sets
are groups. In the beginning, every subgoal constitutes a group by itself. Whenever we
discover that two subgoals share a free variable not bound by subgoals of the binding set,
we unite their groups into one. To complete the procedure, we need a way to determine
which variables are bound by the given binding set. Section 7.1 contains a discussion of
this problem and proposes some practical solutions. Finally, we collect all the indivisible
subgoals into a separate group. These operations can be implemented in $O(n\,\alpha(n, n))$ amor-
tized time, where $\alpha(n, n)$ is the inverse Ackermann function, which can be considered $O(1)$
for all values of $n$ that can appear in realistic logic programs. Thus, the whole process of
finding the divisibility partition of $n$ subgoals can be performed in $O(n)$ average time.

The formal listing of the ordering algorithm discussed above is shown in Figure 10.
The algorithm does not specify explicitly how candidate sets are created from consis-
tency sets. To complete this algorithm, we must provide the three filtering procedures
Algorithm 5

    Order(S_0)
        RootCandSet ← CandidateSet(S_0, ∅)
        Return the cheapest member of RootCandSet

    CandidateSet(S, B)
        case (S under B)
            independent:
                let ConsSet_N ← π(S)
                let CandSet_N ← ValidLeafFilter(ConsSet_N)
            divisible:
                let {S_1, S_2, ..., S_k} ← DPart(S, B)
                loop for i = 1 to k
                    let C_i ← CandidateSet(S_i, B)
                let ConsSet_N ← {O_N ∈ π(S) | ∀i = 1...k, ∃O_i ∈ C_i : Cons(O_i, O_N)}
                let CandSet_N ← ValidANDFilter(ConsSet_N, {S_1, ..., S_k}, {C_1, ..., C_k})
            indivisible:
                loop for A ∈ S
                    let C(A) ← CandidateSet(S \ {A}, B ∪ {A})
                    let C'(A) ← {A || O_A | O_A ∈ C(A)}
                let ConsSet_N ← ∪_{A ∈ S} C'(A)
                let CandSet_N ← ValidORFilter(ConsSet_N)
        Return CandSet_N

Figure 10: The skeleton of the DAC ordering algorithm. For each type of node in a divisibility tree,
    a consistency set is created and refined through validity filters. The produced candidate
    set of the root is valid; hence, its cheapest member is a minimal ordering of the given
    set.

ValidLeafFilter, ValidANDFilter and ValidORFilter. Trivially, we can define them
all as null filters that return the sets they receive unchanged. In this case the candidate
set of every node will contain all the permutations of its subgoals, and will surely be valid.
This will, however, greatly increase the ordering time. Our intention is to reduce the sizes
of candidate sets as far as possible, while keeping them valid.

In the following two subsections we discuss the filtering procedures. Section 4.4 dis-
cusses detection of MC-contradicting orderings, and Section 4.5 discusses detection of MC-
equivalent orderings. Finally, in Section 4.6 we present the complete ordering algorithm,
incorporating the filters into the skeleton of Algorithm 5.
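The skeleton of Algorithm 5 with null filters can be sketched in a few dozen lines. Everything about the encoding is an assumption made for illustration: subgoals are a dictionary from names to variable sets, a binder binds all of its variables, the three filters are collapsed into one `keep` parameter that defaults to the identity, and the partition is computed by simple group merging rather than Union-Find. The control values used in the usage lines are those of Figure 9.

```python
from itertools import permutations, product

def dpart(goals, bound_vars):
    """Return (groups linked by unbound shared variables, subgoals linked to nothing)."""
    groups = []                                    # list of (names, unbound variables)
    for g in sorted(goals):
        names, free = {g}, set(goals[g]) - bound_vars
        for other in [grp for grp in groups if grp[1] & free]:
            groups.remove(other)
            names |= other[0]
            free |= other[1]
        groups.append((names, free))
    linked = [frozenset(n) for n, _ in groups if len(n) > 1]
    alone = frozenset(g for n, _ in groups if len(n) == 1 for g in n)
    return linked, alone

def interleavings(seqs):
    """All merges of the sequences that preserve the inner order of each one."""
    seqs = [s for s in seqs if s]
    if not seqs:
        yield ()
        return
    for i, s in enumerate(seqs):
        for tail in interleavings(seqs[:i] + [s[1:]] + seqs[i + 1:]):
            yield (s[0],) + tail

def candidate_set(goals, bound_vars=frozenset(), keep=lambda orderings: orderings):
    linked, alone = dpart(goals, bound_vars)
    if not linked:                                            # independent: leaf
        cons = set(permutations(sorted(goals)))
    elif len(linked) == 1 and not alone:                      # indivisible: OR-node
        cons = set()
        for a in sorted(goals):
            rest = {g: v for g, v in goals.items() if g != a}
            for tail in candidate_set(rest, bound_vars | set(goals[a]), keep):
                cons.add((a,) + tail)                         # binder goes first
    else:                                                     # divisible: AND-node
        parts = linked + ([alone] if alone else [])
        children = [candidate_set({g: goals[g] for g in p}, bound_vars, keep)
                    for p in parts]
        cons = set()
        for combo in product(*children):
            cons.update(interleavings(list(combo)))
    return keep(cons)

def control(goal, bound):
    """Control values of Figure 9."""
    if goal == "a2":
        return (3, 2) if "a1" in bound else (5, 2)
    return {"a1": (2, 2), "b": (5, 3), "d": (1, 1)}[goal]

def total_cost(ordering):
    total, multiplier, bound = 0, 1, frozenset()
    for goal in ordering:
        cost, nsols = control(goal, bound)
        total += multiplier * cost
        multiplier *= nsols
        bound = bound | {goal}
    return total

goals = {"a1": {"X"}, "a2": {"X"}, "b": set()}
root_candidates = candidate_set(goals)
best = min(sorted(root_candidates), key=total_cost)
```

With the identity filter the root candidate set contains every permutation, so the sketch is only as fast as exhaustive search; the point of the filters discussed next is to shrink these sets.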
4.4 Detection of MC-Contradicting Orderings

In this subsection we show sufficient conditions for an ordering to be MC-contradicting. 
Such orderings can be safely discarded, leaving the set of orderings valid, but reducing its 
size. The subsection is divided into three parts, one for each type of node in a divisibility 
tree. 

4.4.1 Detection of MC-Contradicting Orderings in Leaves 

The following lemma shows that subgoals in a min-consistent ordering of a leaf node must
be sorted by cn.

Lemma 4

Let $\mathcal{S}_0$ be a set of subgoals, $N$ be a leaf in the divisibility tree of $\mathcal{S}_0$. Let $\vec{O}_N$ be an ordering of
$\mathcal{S}(N)$. If the subgoals of $\vec{O}_N$ are not sorted by cn under $\mathcal{B}(N)$, then $\vec{O}_N$ is MC-contradicting.

Proof: Let $\vec{O}_S$ be any ordering of $\mathcal{S}_0$, binder-consistent with $\vec{O}_N$. We show that $\vec{O}_S$ cannot
be a minimal ordering of $\mathcal{S}_0$, thus $\vec{O}_N$ is not min-consistent.

$\vec{O}_N$ is not sorted by cn, i.e., it contains an adjacent cn-inverted pair of subgoals $\langle A_1, A_2 \rangle$.
(Recall that a pair is cn-inverted if the first element has a larger cn value than the second
one - Section 3.2.3). Since $\vec{O}_S$ is consistent with $\vec{O}_N$, we can write $\vec{O}_S = X \| A_1 \| Y \| A_2 \| Z$,
where $X$, $Y$ and $Z$ are (possibly empty) sequences of subgoals. Since $\vec{O}_S$ is binder-consistent
with $\vec{O}_N$, $\mathcal{B}(N) \subseteq X$.

If $Y$ is empty, then $A_1$ and $A_2$ are adjacent in $\vec{O}_S$. Since $\mathcal{B}(N) \subseteq X$, $A_1$ and $A_2$ are
independent under $X$. Therefore, the cost of the whole ordered sequence can be reduced
by transposing $A_1$ and $A_2$, according to Lemma 2 (they are adjacent, independent and
cn-inverted).

If $Y$ is not empty, then no subgoal of $Y$ belongs to $\mathcal{S}(N)$, since otherwise it would appear
in $\vec{O}_N$ between $A_1$ and $A_2$. By Corollary 2, $Y$ is mutually independent of both $A_1$ and $A_2$
under $X$.

• If $cn(Y)|_X < cn(A_1)|_X$ then, by Lemma 2, a transposition of $Y$ with $A_1$ produces an
ordering with lower cost.

• Otherwise, $cn(Y)|_X \ge cn(A_1)|_X$. Since the pair $\langle A_1, A_2 \rangle$ is cn-inverted, $cn(A_1)|_X >
cn(A_2)|_X$. Hence, $cn(Y)|_X > cn(A_2)|_X$, and transposition of $Y$ with $A_2$ reduces the
cost, by Lemma 2.

In either case, there is a way to reduce the cost of $\vec{O}_S$. Therefore, $\vec{O}_S$ cannot be minimal,
and $\vec{O}_N$ is MC-contradicting. □

4.4.2 Detection of MC-Contradicting Orderings in AND-nodes 

Every member of the consistency set of an AND-node is consistent with some combination
of candidates of its child nodes. If there are $k$ child nodes, and for each child $N_i$ the sizes
of subgoal and candidate sets are $|S(N_i)| = n_i$ and $|CandSet(N_i)| = c_i$, then the total
number of possible consistent orderings is

$c_1 \cdot c_2 \cdots c_k \cdot \frac{(n_1 + n_2 + \cdots + n_k)!}{n_1! \, n_2! \cdots n_k!}$.

Fortunately, most of these orderings are MC-contradicting and can be discarded from the
candidate set. The following lemma states that it is forbidden to insert other subgoals
between two cn-inverted sub-sequences. If such insertion takes place, the ordering is
MC-contradicting and can be safely discarded.
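
To get a feeling for how fast the consistency set of an AND-node grows, the count above can be evaluated directly. The following is a minimal Python sketch; the function name and its argument names are illustrative assumptions, not part of the paper.

```python
from math import factorial, prod

def consistency_set_bound(child_sizes, child_candidates):
    """Number of orderings consistent with some combination of child candidates.

    child_sizes[i]      -- n_i, the number of subgoals of child i
    child_candidates[i] -- c_i, the number of candidate orderings of child i
    """
    # Multinomial coefficient: ways to interleave k sequences of lengths n_i
    # while keeping the internal order of each sequence.
    interleavings = factorial(sum(child_sizes))
    for n_i in child_sizes:
        interleavings //= factorial(n_i)
    return prod(child_candidates) * interleavings

# Two children with two subgoals and one candidate each: six interleavings.
print(consistency_set_bound([2, 2], [1, 1]))
```

Even for modest sizes (for instance, three children with four subgoals each) the count is in the tens of thousands, which is why discarding MC-contradicting orderings matters.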

Lemma 5 

Let $S_0$ be a set of subgoals, $N$ a node in the divisibility tree of $S_0$, and $O_S$ an ordering of
$S_0$, binder-consistent with an ordering $O_N$ of $S(N)$.

If $O_N$ contains an adjacent cn-inverted pair of sub-sequences $(A_1, A_2)$, $A_1$ and $A_2$ appear
in $O_S$ not mixed with other subgoals, and $A_1$ and $A_2$ are not adjacent in $O_S$, then $O_S$ is
not minimal.

Proof: Let $O_S$ be such an ordering of $S_0$, binder-consistent with $O_N$:

$O_S = X \| A_1 \| Y \| A_2 \| Z$,

where $Y$ is not empty. No subgoal of $Y$ belongs to $S(N)$, since otherwise it would stand
in $O_N$ between $A_1$ and $A_2$. $O_S$ is binder-consistent with $O_N$; therefore, $B(N) \subseteq X$. By
Corollary 2, $Y$ must be mutually independent of both $A_1$ and $A_2$ under $X$, and by Lemma 2
a transposition of $Y$ with either $A_1$ or $A_2$ reduces the cost - exactly as in the proof of
Lemma 4. □

If a pair of adjacent subgoals $(A_i, A_{i+1})$ is cn-inverted, then by the previous lemma any
attempt to insert subgoals inside it results in a non-minimal global ordering. Thereupon
we may join $A_i$ and $A_{i+1}$ into a block $A_{i,i+1}$, which can further participate in a larger block.
The formal recursive definition of a block follows. For convenience, we consider separate
subgoals to be blocks of length 1.

Definition:

1. A sub-sequence $A$ of an ordered sequence of subgoals is a block if it is either a single
subgoal, or $A = A_1 \| A_2$, where $(A_1, A_2)$ is a cn-inverted pair of blocks.

2. A block is maximal (max-block) if it is not a sub-sequence of a larger block.

3. Let $N$ be a node in a divisibility tree, $M$ be some descendant of $N$, and let $O_N \in \pi(S(N))$
and $O_M \in \pi(S(M))$ be two consistent orderings of these nodes. A block $A$ of $O_M$ is
violated in $O_N$ if there are two adjacent subgoals in $A$ that are not adjacent in $O_N$ (in
other words, alien subgoals are inserted between the subgoals of the block).

4. Let $N$ be a node, $M$ be its descendant, and let $O_N \in \pi(S(N))$ and $O_M \in \pi(S(M))$ be two
consistent orderings of these nodes. $O_M$ is called the projection of $O_N$ on $M$. We
shall usually speak about projection of an ordering on a child node.

The concept of max-block is similar to the maximal indivisible block introduced by Simon 
and Kadane (1975) in the context of satisficing search. The following corollary presents the 
result of Lemma 5 in a more convenient way. 

Corollary 3 Let $N$ be a node in a divisibility tree, $M$ be one of its children, $O_N$ be an
ordering of $N$, and $O_M$ be the projection of $O_N$ on $M$. If $O_M$ contains a block that is
violated in $O_N$, then $O_N$ is MC-contradicting.






Proof: Let $A$ be the smallest block of $O_M$ violated in $O_N$. According to the definition
of a block, $A = A_1 \| A_2$, where $A_1$ and $A_2$ are not violated in $O_N$, and the pair $(A_1, A_2)$
is cn-inverted. Let $O_S$ be any ordering of the root node binder-consistent with $O_N$. $O_S$
violates $A$, since $O_N$ violates $A$. To show that $O_N$ is MC-contradicting, we must prove that
$O_S$ is not minimal.

• If $A_1$ and $A_2$ are not violated in $O_S$, then they are not adjacent in $O_S$, and $O_S$ is not
minimal, by Lemma 5.

• Otherwise, $A_1$ or $A_2$ is violated in $O_S$. Without loss of generality, let it be $A_1$. Let $A'$
be the smallest sub-block of $A_1$ violated in $O_S$. According to the definition of a block,
$A' = A'_1 \| A'_2$, where the pair $(A'_1, A'_2)$ is cn-inverted, $A'_1$ and $A'_2$ are not violated and
not adjacent in $O_S$. By Lemma 5, $O_S$ is not minimal. □

For example, if control values of subgoals are as shown in Figure 9, then $(a1(X), a2(X))$
is a block, since the pair is cn-inverted: $cn(a1(X))|_\emptyset > cn(a2(X))|_{\{a1(X)\}}$ (the numeric
values are given in Figure 9). As one can see from the figure, insertion of $b$ or $d$ inside
this block results in a non-minimal ordering.

As was already noted above, the consistency set of an AND-node can be large. In 
many of its orderings, however, blocks of projections are violated, and we can discard 
these orderings as MC-contradicting. In the remaining orderings, no block of a projection 
is violated, and each such ordering can be represented as a sequence of max-blocks of the 
projections. In each projection, its max-blocks stand in cn-ascending order (otherwise, there
is an adjacent cn-inverted pair of blocks, and a larger block can be formed, which contradicts
their maximality). As the following lemma states, in the parent AND-node these blocks
must also be ordered by their cn values; otherwise, the ordering is MC-contradicting.

Lemma 6 If an ordering of an AND-node contains an adjacent cn-inverted pair of max- 
blocks of its projections on the children, then this ordering is MC-contradicting. 

Proof: If these blocks are violated in the binder-consistent global ordering, the global 
ordering is not minimal by Corollary 3. If the blocks are not violated, the proof is similar 
to the proof of Lemma 4. □ 

The two sufficient conditions for detection of MC-contradicting orderings expressed in 
Corollary 3 and Lemma 6 allow us to reduce the size of the candidate set significantly. 
Assume, for example, that the set of our current node $N$ is split into two mutually independent
subsets whose candidates are $(a_1, a_2)$ and $(b_1, b_2)$ (one candidate for each child). There
are six possible orderings of $S(N)$, all shown in Figure 11. Assume that both $(a_1, a_2)$ and
$(b_1, b_2)$ are blocks, and $cn((a_1, a_2))|_{B(N)} < cn((b_1, b_2))|_{B(N)}$. Out of six consistent orderings,
four (2-5) can be rejected due to block violation, and one of the remaining two (number 6)
puts the blocks in the wrong order. So, only one ordering (number 1) can be left in the
candidate set of $N$. Even if neither $(a_1, a_2)$ nor $(b_1, b_2)$ is a block, Lemma 6 dictates a unique
interleaving of their elements (max-blocks), assuming that
$cn(a_1)|_{B(N)} \neq cn(a_2)|_{B(N) \cup \{a_1\}} \neq cn(b_1)|_{B(N)} \neq cn(b_2)|_{B(N) \cup \{b_1\}}$.
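
The counting in this example can be reproduced mechanically. The sketch below (in Python; all names are illustrative assumptions) enumerates the interleavings of two child candidates and keeps those in which neither block is violated. It does not assume the particular numbering of orderings used in Figure 11.

```python
from itertools import combinations

def interleavings(xs, ys):
    """Yield every merge of xs and ys that keeps each sequence's internal order."""
    total = len(xs) + len(ys)
    for slots in combinations(range(total), len(xs)):
        xi, yi = iter(xs), iter(ys)
        yield tuple(next(xi) if pos in slots else next(yi) for pos in range(total))

def is_contiguous(ordering, block):
    """True if the elements of block occupy adjacent positions in ordering."""
    start = ordering.index(block[0])
    return ordering[start:start + len(block)] == tuple(block)

a_block, b_block = ("a1", "a2"), ("b1", "b2")
all_orderings = list(interleavings(a_block, b_block))
unbroken = [o for o in all_orderings
            if is_contiguous(o, a_block) and is_contiguous(o, b_block)]

print(len(all_orderings))  # six consistent orderings
print(unbroken)            # the two orderings that violate no block
```

Of the two surviving orderings, Lemma 6 then keeps only the one whose blocks are in cn-ascending order.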

4.4.3 Detection of MC-Contradicting Orderings in OR-nodes 

The following lemma states that if a block has a cheaper permutation, then the ordering is 
MC-contradicting (and can be discarded from the candidate set).

Figure 11: The possible ways to combine $(a_1, a_2)$ and $(b_1, b_2)$.

Lemma 7 Let $N$ be a node in the divisibility tree of $S_0$, $O_N \in \pi(S(N))$. Let $A$ be a leading
block of $O_N$: $O_N = A \| R$. If there is a permutation of $A$, $A'$, such that
$cost(A')|_{B(N)} < cost(A)|_{B(N)}$, then $O_N$ is MC-contradicting.

Proof: Let $O_S \in \pi(S_0)$ be binder-consistent with $O_N$. If $A$ is violated in $O_S$, $O_S$ cannot
be minimal (Corollary 3). Otherwise, $A$ occupies a continuous segment in $O_S$, and its
replacement by a cheaper permutation reduces the cost of the global ordering (Lemma 1).
Thus, $O_S$ cannot be minimal. □

This check should be done only for leading blocks of OR-nodes: 

• Every ordering of a leaf node that has not been rejected due to Lemma 4 must be
sorted by cn. Consequently, it contains no cn-inverted adjacent pair of subgoals, and
no block of size $\geq 2$ can be formed.

• Every ordering of an AND-node that has not been rejected due to Corollary 3 or
Lemma 6 must have its blocks unbroken and in cn-ascending order. Consequently,
new blocks cannot be formed here either.

• In OR-nodes, new blocks can be formed when we add a binder as the first element of
an ordering, if the cn value of the binder is greater than that of the subsequent block.
All new blocks start from the binder, and we must perform the permutation test only
on the leading max-block of an ordering.

4.5 Detection of MC-Equivalent Orderings 

In the previous subsection we presented sufficient conditions for detecting MC-contradicting 
orderings. In this subsection we specify sufficient conditions for identifying MC-equivalent 
orderings. Recall that two orderings of a node are MC-equivalent if minimal consistency 
of one implies minimal consistency of the other. Finding such sufficient conditions will 
allow us to eliminate orderings without loss of validity of the candidate set. We start
by defining a specialization of the MC-equivalence relation: blockwise equivalence. We
then show that orderings whose max-blocks are sorted by cn are blockwise-equivalent, and
therefore MC-equivalent.

Definition: Let $S_0$ be a set of subgoals and $N$ be a node in the divisibility tree of $S_0$. Let
$O_1$ and $O_2$ be two orderings of $S(N)$ with an equal number of max-blocks. Let $O_S$ be an
ordering of $S_0$, binder-consistent with $O_1$, where blocks of $O_1$ are not violated.

$O_S|_{O_1}^{O_2}$ is the ordering obtained by replacing in $O_S$ every max-block of $O_1$ with a
max-block of $O_2$, while preserving the order of max-blocks (the $i$-th max-block of $O_1$ is replaced
by the $i$-th max-block of $O_2$).

$O_1$ and $O_2$ are blockwise-equivalent if the following condition holds: $O_1$ is min-consistent
with $O_S$ iff $O_2$ is min-consistent with $O_S|_{O_1}^{O_2}$.

As can be easily seen, if two orderings are blockwise-equivalent, then they are MC- 
equivalent. Now we show that a transposition of adjacent, mutually independent cn-equal 
max-blocks in an ordering of a node produces a blockwise-equivalent ordering. The proof 
of the following lemma is found in Appendix A. 

Lemma 8 

Let $S_0$ be a set of subgoals, $N$ be a node in the divisibility tree of $S_0$, $O_N = Q \| A_1 \| A_2 \| R$ be
an ordering of $S(N)$, where $A_1$ and $A_2$ are max-blocks, mutually independent and cn-equal
under the bindings of $B(N) \cup Q$. Then $O_N$ is blockwise-equivalent with $O'_N = Q \| A_2 \| A_1 \| R$.

Corollary 4 All orderings of a leaf node that are sorted by cn are blockwise-equivalent.

For example, if $S(N) = \{A, B, C, D\}$, $cn(A)|_{B(N)} = 0.1$, $cn(B)|_{B(N)} = cn(C)|_{B(N)} = 0.3$,
$cn(D)|_{B(N)} = 0.5$, then the orderings $(A, B, C, D)$ and $(A, C, B, D)$ are blockwise-equivalent,
and we can remove from the candidate set any one of them (but not both).

Corollary 5 All orderings of an AND-node, where blocks of projections are not violated 
and adjacent max-blocks from different children projections are cn-ordered, are blockwise- 
equivalent. 

For example, if the candidates of the children are $A \| B$ and $C \| D$, where $A$, $B$, $C$, $D$ are
max-blocks, $cn(A)|_{B(N)} = 0.1$, $cn(B)|_{B(N) \cup A} = cn(C)|_{B(N)} = 0.3$ and $cn(D)|_{B(N) \cup C} = 0.5$,
then the orderings $A \| B \| C \| D$ and $A \| C \| B \| D$ are blockwise-equivalent, and we can remove
from the candidate set any one of them (but not both).

To prove both Corollaries 4 and 5, we note that in each case one of the mentioned 
orderings can be obtained from the other by a finite number of transpositions of adjacent, 
mutually independent and cn-equal max-blocks. According to Lemma 8, each such transpo- 
sition yields a blockwise-equivalent ordering. It is easy to show that blockwise equivalence 
is transitive. 

The following corollary states that subgoals within a block can be permuted, provided 
that the cost of the block is not changed. 

Corollary 6 All orderings of a node, identical up to cost-preserving permutations of sub- 
goals inside blocks, are blockwise-equivalent. 

The proof of the corollary follows immediately from Lemma 1. For example, if the set
is $\{a(X), b(X)\}$, and the control values are as in the first counter-example of Proposition 1,
i.e., $cn(a(X))|_\emptyset = cn(b(X))|_\emptyset$ and $cn(a(X))|_{\{b(X)\}} = cn(b(X))|_{\{a(X)\}} = 0$, then in both
possible orderings, $(a(X), b(X))$ and $(b(X), a(X))$, the two subgoals are united into a block,
and these blocks have equal cost. In any global ordering containing the block $(a(X), b(X))$,
we can replace this block with $(b(X), a(X))$ without changing the total cost. Therefore
$(a(X), b(X))$ is blockwise-equivalent to $(b(X), a(X))$.

Node   Set           MC-contradicting                   Blockwise-equivalent
-----  ------------  ---------------------------------  ---------------------------------
Leaf   Independent   Subgoals not sorted by cn          Subgoals sorted by cn
                     (Lemma 4)                          (Corollary 4)

AND    Divisible     Contains violated blocks           Max-blocks not violated,
                     (Corollary 3)                      sorted by cn
                     Max-blocks not sorted by cn        (Corollary 5)
                     (Lemma 6)

OR     Indivisible   The leading max-block has          Cost-preserving permutations
                     a cheaper permutation              of blocks
                     (Lemma 7)                          (Corollary 6)

Table 1: Summary of sufficient conditions for detection of MC-contradicting and blockwise-
equivalent orderings.

The sufficient condition expressed in Corollary 6 should be checked only in OR-nodes, 
since in leaves and AND-nodes no new blocks are created, as was argued in Section 4.4.3. 

4.6 The Revised Ordering Algorithm 

In the two preceding subsections we saw several sufficient conditions of MC-contradiction 
and MC-equivalence, summarized in Table 1. These results permit us to close the gaps 
in Algorithm 5 by providing the necessary validity filters. Each filter tests the sufficient 
conditions of MC-contradiction and MC-equivalence on every ordering in the consistency 
set. If some of these sufficient conditions hold, the ordering is rejected. The formal listing
of these procedures is shown in Figure 12.

While the generate-and-test approach described above served us well for methodological 
purposes, it is obviously not practical because of its computational limitations. For example, 
for an independent set of size $n$, the algorithm creates $n!$ orderings, then rejects $n! - 1$
and keeps only one. This process takes $O(n! \cdot n)$ time and produces an ordering which
is sorted by cn. The same result could be obtained in just $O(n \log n)$ time, by a single
sorting. So, instead of uncontrolled creation of orderings and selective rejection, we want to 
perform a selective creation of orderings. In other words, we want to revise our algorithm to 
deal directly with candidate sets, instead of generating large consistency sets. The revised 
algorithm produces the candidate set of a node N as follows: 

• If $N$ is a leaf, the subgoals of $S(N)$ are sorted by cn under the bindings of $B(N)$, and
the produced ordering is the sole candidate of $N$.

• If $N$ is an AND-node, then for each combination of its children's candidates a candidate
of $N$ is created, where the max-blocks of the children's candidates are ordered
by cn. The candidate is produced by merging: moving in parallel on the candidates
of the children and extracting max-blocks that are minimal by cn.

• If $N$ is an OR-node, then for each candidate of its child an ordering of $N$ is created
by adding the binder to the left end of the child candidate. If this results in creation
of a block that has a cheaper permutation, the ordering is rejected; otherwise, it is
added to the candidate set. It suffices to check only the leading max-block.

Note that the revised algorithm does not include a test for cost-preserving permutations
of blocks in different orderings (expressed in Corollary 6), because of the high expense of
such a test.

ValidLeafFilter($ConsSet_N$)
    let $CandSet_N \leftarrow \emptyset$
    loop for $O_N \in ConsSet_N$
        if $O_N$ is sorted by cn
            and there is no $O'_N \in CandSet_N$ which is sorted by cn
        then $CandSet_N \leftarrow CandSet_N \cup \{O_N\}$
    Return $CandSet_N$

ValidANDFilter($ConsSet_N$, $\{S_1, \ldots, S_k\}$, $\{C_1, \ldots, C_k\}$)
    let $CandSet_N \leftarrow \emptyset$
    loop for $O_N \in ConsSet_N$
        loop for $i = 1$ to $k$
            let $O_i$ be the projection of $O_N$ on $S_i$
        if $\forall i,\ O_i \in C_i$
            and max-blocks of $O_i$-s are not violated in $O_N$,
            and max-blocks of $O_i$-s are ordered by cn in $O_N$,
            and there is no $O'_N \in CandSet_N$ consistent with all $O_i$-s,
        then $CandSet_N \leftarrow CandSet_N \cup \{O_N\}$
    Return $CandSet_N$

ValidORFilter($ConsSet_N$)
    let $CandSet_N \leftarrow \emptyset$
    loop for $O_N \in ConsSet_N$
        if $O_N$ does not start with a block having a cheaper permutation,
            and there is no $O'_N \in CandSet_N$, identical to $O_N$ up to
                cost-preserving permutations in blocks,
        then $CandSet_N \leftarrow CandSet_N \cup \{O_N\}$
    Return $CandSet_N$

Figure 12: The three filter procedures that convert a consistency set into a candidate set. Together
with Algorithm 5, they form a complete ordering algorithm. The efficiency of the
algorithm can be improved, as we shall see in Algorithm 6.
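
The leaf and AND-node cases of the revised algorithm can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the formula cn = (nsols - 1) / cost is inferred from the cost, nsols and cn rows of Table 2, every item is given as a (cost, nsols) pair already evaluated under the appropriate binding set, and the function and variable names are hypothetical.

```python
def cn(cost, nsols):
    """cn value, assuming cn = (nsols - 1) / cost (consistent with Table 2)."""
    return (nsols - 1) / cost

def sort_leaf(subgoals):
    """Leaf: the sole candidate is the subgoals sorted by cn.
    `subgoals` maps a name to its (cost, nsols) under the leaf's binding set."""
    return sorted(subgoals, key=lambda name: cn(*subgoals[name]))

def merge(candidates, values):
    """AND-node: repeatedly extract the leading max-block with minimal cn.
    `candidates` are folded orderings of mutually independent children."""
    queues = [list(c) for c in candidates if c]
    merged = []
    while queues:
        best = min(queues, key=lambda q: cn(*values[q[0]]))
        merged.append(best.pop(0))
        queues = [q for q in queues if q]  # drop exhausted candidates
    return merged

# Values taken from the sample run of Section 4.7; "ec" is the folded block (e(X), c(X)).
values = {"a": (10, 0.8), "b": (5, 2), "ec": (22, 0.2), "d": (5, 1)}
print(sort_leaf({"a": values["a"], "b": values["b"]}))
print(merge([["a", "b"], ["ec", "d"]], values))
```

Because the children of an AND-node are mutually independent, the cn value of a max-block does not depend on which blocks of the other children precede it, which is what allows the sketch to use fixed values.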

The revised algorithm described above contains manipulations of blocks. For this pur- 
pose, we need an easy and efficient way to detect blocks in orderings. Since we do not 
permit block violation (by Corollary 3), we can unite all the subgoals of a max-block into 
one entity, and treat it as an ordinary subgoal. The procedure of joining subgoals into 
blocks is called folding, and the resulting sequence of max-blocks - a folded sequence. After 
subgoals are folded into a block, there is no need to unfold this block back to separate 
subgoals: on upper levels of the tree, these subgoals will again be joined into a block, unless 
the block is violated. The unfolding operation is carried out only once before returning the 
cheapest ordering of the set (of the root node). The candidate sets of the nodes are now 
defined as sets of folded orderings. 

As was already stated, new blocks can only be created in the candidates of OR-nodes, 
when the binder is added as the first element of the ordering, if the cn value of the binder
is greater than the cn value of the first max-block of the child projection. Therefore, in the
revised algorithm we only build new blocks that start from the binder: the max-blocks in 
the rest of the ordering remain from the child's candidate. First we try to make a block 
out of the binder and the first max-block of the child's candidate. If they are cn-ordered, 
we stop the folding. If they are cn-inverted, we unite them into a larger block, and try 
to unite it with the second max-block of the child's candidate, and so on. The produced 
folded ordering contains only maximal blocks: the first block is maximal, since we could not 
expand it further to the right, and the other blocks are maximal, since they were maximal 
in the child's candidate. 
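
The folding loop described above can be sketched as follows (Python). The sketch makes several assumptions: each block is represented only by its control values (cost, nsols), already evaluated under the bindings in force at its position; cn = (nsols - 1) / cost, as the values in Table 2 suggest; and the control values of a joined block are computed as in the sample run of Section 4.7. The adjacency restriction test is deliberately left out here, and all names are illustrative.

```python
def cn(cost, nsols):
    """cn value, assuming cn = (nsols - 1) / cost."""
    return (nsols - 1) / cost

def join(first, second):
    """Control values (cost, nsols) of the block first || second."""
    (c1, n1), (c2, n2) = first, second
    return (c1 + n1 * c2, n1 * n2)

def fold(binder, child_blocks):
    """Absorb leading max-blocks of the child's candidate into the binder's
    block for as long as the adjacent pair is cn-inverted."""
    block, rest = binder, list(child_blocks)
    while rest and cn(*block) > cn(*rest[0]):
        block = join(block, rest.pop(0))
    return [block] + rest

# Binder e(X) with X free, followed by c(X) and d(X) with X bound.
folded = fold((20, 0.4), [(5, 0.5), (5, 1)])
print(folded)  # the first two items are joined into one block, d(X) stays separate
```

Only the leading block can grow, so the loop touches each max-block of the child's candidate at most once.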

Lemma 7 states that an ordering whose leading max-block has a cheaper permutation 
is MC-contradicting. One way to detect such a block is to exhaustively test all its permu- 
tations, computing and comparing their costs. This procedure is very expensive. Instead, 
in our revised algorithm we employ the adjacency restriction test (Equation 8). The test is 
applied to every pair of adjacent subgoals of a block, and if some adjacent pair has a cheaper 
transposition, then the whole block has a cheaper permutation, by Lemma 1. Since blocks 
are created by concatenation of smaller blocks, it suffices to test the adjacency restriction 
only at the points where blocks are joined (for other adjacent pairs of subgoals, the tests 
were performed on the lower levels, when smaller blocks were formed). The adjacency re- 
striction test does not guarantee detection of all not-cheapest permutations (as was shown 
in Example 3), but it detects such blocks in many cases, and works in linear time. 
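
A minimal sketch of the test at a single junction is given below (Python). Equation 8 itself is not reproduced in this section, so the sketch assumes that the test amounts to comparing the cost of the adjacent pair in its two possible orders, as is done in the sample run of Section 4.7; whether ties pass is also an assumption. Each subgoal is passed as a (cost, nsols) pair evaluated under the bindings it would see in the given order.

```python
def pair_cost(first, second):
    """Cost of proving `first` and then `second`; each is a (cost, nsols) pair."""
    return first[0] + first[1] * second[0]

def passes_adjacency(in_given_order, in_swapped_order):
    """True if the given order of the pair is not more expensive than the swap."""
    return pair_cost(*in_given_order) <= pair_cost(*in_swapped_order)

# Control values from Table 2.
c_free, c_bound = (5, 2), (5, 0.5)
e_free, e_bound = (20, 0.4), (10, 0.1)

print(passes_adjacency((c_free, e_bound), (e_free, c_bound)))  # (c, e): 25 vs 22
print(passes_adjacency((e_free, c_bound), (c_free, e_bound)))  # (e, c): 22 vs 25
```

Since each block is built by joining two smaller blocks, one such comparison per join is enough, which keeps the total work linear in the length of the block.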

The final version of the DAC subgoal ordering algorithm is presented in Figure 13. The 
complete correctness proof of Algorithm 6 is found in Appendix B. 

4.7 Sample Run and Comparison of Ordering Algorithms 

We illustrate the work of the DAC algorithm, using the subgoal set shown in Figure 8,
$S_0 = \{a, b, c(X), d(X), e(X)\}$. After proving $c(X)$, $d(X)$ or $e(X)$, we can assume that $X$ is
bound. Let the control values for the subgoals be as shown in Table 2. The column c(free)
contains control values for the subgoal $c(X)$ when $X$ is not yet bound by the preceding
subgoals (i.e., the binding set does not contain $d(X)$ or $e(X)$). The column c(bound)
contains cost values of $c(X)$ when $d(X)$ or $e(X)$ have already bound $X$. For example,
$cost(c(X))|_{\{d(X)\}} = cost(c(bound)) = 5$.

        a      b     c(free)  c(bound)  d(free)  d(bound)  e(free)  e(bound)
cost    10     5     5        5         10       5         20       10
nsols   0.8    2     2        0.5       4        1         0.4      0.1
cn      -0.02  0.2   0.2      -0.1      0.3      0         -0.03    -0.09

Table 2: Control values for the sample runs of the ordering algorithms.

Algorithm 6: The Divide-and-Conquer Algorithm

Order($S_0$)
    let $RootCandSet \leftarrow$ CandidateSet($S_0$, $\emptyset$)
    Return Unfold(the cheapest element of $RootCandSet$)

CandidateSet($S$, $B$)
    let $\{S_1, S_2, \ldots, S_k\} \leftarrow$ DPart($S$, $B$)
    case
    • $k = 1$, shared-vars($S_1$) $= \emptyset$ ($S$ is independent under $B$):
        Return {Sort-by-cn($S$, $B$)}
    • $k = 1$, shared-vars($S_1$) $\neq \emptyset$ ($S$ is indivisible under $B$):
        loop for $A \in S$
            let $C(A) \leftarrow$ CandidateSet($S \setminus \{A\}$, $B \cup \{A\}$)
            let $C'(A) \leftarrow \{$Fold($A \| O_A$, $B$) $\mid O_A \in C(A)\}$
        Return $\bigcup_{A \in S} C'(A)$
    • $k > 1$ ($S$ is divisible under $B$):
        loop for $i = 1$ to $k$
            let $C_i \leftarrow$ CandidateSet($S_i$, $B$)
        Return $\{$Merge($\{O_1, O_2, \ldots, O_k\}$, $B$) $\mid O_1 \in C_1, O_2 \in C_2, \ldots, O_k \in C_k\}$

Merge($\{O_1, O_2, \ldots, O_k\}$, $B$)
    let min-cn-candidate $\leftarrow$ $O_i$ that minimizes $cn$(first-max-block($O_i$))$|_B$, $1 \leq i \leq k$
    let min-cn-block $\leftarrow$ first-max-block(min-cn-candidate)
    remove-first-max-block(min-cn-candidate)
    Return min-cn-block $\|$ Merge($\{O_1, O_2, \ldots, O_k\}$, $B \cup$ min-cn-block)

Fold($(A_1, A_2, \ldots, A_k)$, $B$)
    if $k \leq 1$ or $cn(A_1)|_B \leq cn(A_2)|_{B \cup A_1}$
    then Return $(A_1, A_2, \ldots, A_k)$
    else
        if the last subgoal of $A_1$ and the first subgoal of $A_2$ satisfy the adjacency restriction
        then
            let $A' \leftarrow$ block($A_1$, $A_2$)
            Return Fold($(A', A_3, \ldots, A_k)$, $B$)
        else Return $\emptyset$

Figure 13: The revised version of the DAC algorithm. The candidate sets are built selectively,
without explicit creation of consistency sets. Candidate sets contain folded orderings,
and unfolding is performed only on the returned global ordering. The code of the
Unfold and Sort-by-cn procedures is not listed, due to its straightforwardness. The
merging procedure recursively extracts from the given folded orderings max-blocks that
are minimal by cn. The folding procedure joins two leading blocks into a larger one, as
long as they are cn-inverted.

The DAC algorithm traverses the divisibility tree of $S_0$ as follows. (The names of the
nodes are as in Figure 8.)

1. The root of the divisibility tree, n1, has empty binding set $B(n1) = \emptyset$, and the
associated subgoal set $S(n1) = \{a, b, c(X), d(X), e(X)\}$. The set $S(n1)$ is partitioned
into two subsets under $B(n1)$: one independent - $\{a, b\}$, and one indivisible -
$\{c(X), d(X), e(X)\}$. These two subsets correspond to two child nodes of the AND-node
n1: n2 and n3, both with empty binding sets.

2. $S(n2)$ is independent under $B(n2)$. Therefore, n2 is a leaf, and its sole candidate
ordering is obtained by sorting its subgoals by cn under $B(n2)$. $cn(a)|_\emptyset = -0.02$,
$cn(b)|_\emptyset = 0.2$, thus $CandSet(n2) = \{(a, b)\}$.

3. $S(n3)$ is indivisible under $B(n3)$. Therefore, n3 is an OR-node, and its three children
are created - one for each subgoal of $S(n3)$ serving as the binder.

• Binder $c(X)$ yields the child node n4 with the associated set $S(n4) = \{d(X), e(X)\}$
and the binding set $B(n4) = \{c(X)\}$. $S(n4)$ is independent under $B(n4)$. Therefore,
n4 is a leaf, and its sole candidate is obtained by sorting its subgoals by cn:
$cn(d(X))|_{\{c(X)\}} = 0$, $cn(e(X))|_{\{c(X)\}} = -0.09$;
thus, the candidate of n4 is $(e(X), d(X))$.

• Binder $d(X)$ yields the child node n5 with the associated set $S(n5) = \{c(X), e(X)\}$
and the binding set $B(n5) = \{d(X)\}$. $S(n5)$ is independent under $B(n5)$, and its
sorting by cn produces the candidate $(c(X), e(X))$.

• Binder $e(X)$ yields the child node n6 with the associated set $S(n6) = \{c(X), d(X)\}$
and the binding set $B(n6) = \{e(X)\}$. $S(n6)$ is independent under $B(n6)$, and its
sorting by cn produces the candidate $(c(X), d(X))$.

4. We now add each binder to its corresponding child's candidate and obtain three orderings
of the OR-node n3: $(c(X), e(X), d(X))$, $(d(X), c(X), e(X))$, $(e(X), c(X), d(X))$.

5. We now perform folding of these orderings and check violations of the adjacency
restriction, in order to determine whether a block has a cheaper permutation.

• First, we perform the folding of $(c(X), e(X), d(X))$. The pair $(c(X), e(X))$ is
cn-inverted: $cn(c(X))|_\emptyset = 0.2$, $cn(e(X))|_{\{c(X)\}} = -0.09$. We thus unite it into a
block. This block does not pass the adjacency restriction test (Equation 8):

$cost((c(X), e(X)))|_\emptyset = 5 + 2 \cdot 10 = 25$,
$cost((e(X), c(X)))|_\emptyset = 20 + 0.4 \cdot 5 = 22$.

Therefore, this ordering is MC-contradicting and can be discarded.

• We perform the folding of $(d(X), c(X), e(X))$. $cn(d(X))|_\emptyset = 0.3$, $cn(c(X))|_{\{d(X)\}} = -0.1$,
the pair is cn-inverted, and we unite it into a block. This block does not
pass the adjacency restriction test:

$cost((d(X), c(X)))|_\emptyset = 10 + 4 \cdot 5 = 30$,
$cost((c(X), d(X)))|_\emptyset = 5 + 2 \cdot 5 = 15$.

This ordering is rejected too, even before its folding is finished. If we continue
the folding process, we shall see that the subgoal $e(X)$ must also be added to this
block, since $cn((d(X), c(X)))|_\emptyset = \frac{2 - 1}{30} = 0.0333$, and $cn(e(X))|_{\{d(X), c(X)\}} = -0.09$.

• We perform the folding of $(e(X), c(X), d(X))$. $cn(e(X))|_\emptyset = -0.03$, $cn(c(X))|_{\{e(X)\}} = -0.1$,
the pair is cn-inverted, and we form a block $ec(X) = (e(X), c(X))$, which
passes the adjacency restriction test:

$cost((e(X), c(X)))|_\emptyset = 20 + 0.4 \cdot 5 = 22$,
$cost((c(X), e(X)))|_\emptyset = 5 + 2 \cdot 10 = 25$.

We compute the control values of the new block:

$cost(ec(X))|_\emptyset = 20 + 0.4 \cdot 5 = 22$
$nsols(ec(X))|_\emptyset = 0.4 \cdot 0.5 = 0.2$
$cn(ec(X))|_\emptyset = \frac{0.2 - 1}{22} = -0.0363636$

$cn(d(X))|_{\{ec(X)\}} = 0$, thus the pair $(ec(X), d(X))$ is cn-ordered, no more folding
is needed, and we add the folded candidate $(ec(X), d(X))$ to the candidate set
of n3.

6. We now perform merging of the candidate set of n2, $\{(a, b)\}$, with the candidate set
of n3, $\{(ec(X), d(X))\}$. In the resulting sequence max-blocks must be sorted by cn.
$cn(a) = -0.02$, $cn(b) = 0.2$, $cn(ec(X))|_\emptyset = -0.0363636$, $cn(d(X))|_{\{ec(X)\}} = 0$.
The merged ordering, $(ec(X), a, d(X), b)$, is added to the candidate set of n1.

7. We compare the costs of all candidates of n1, and output the cheapest one. In our case,
there is only one candidate, $(ec(X), a, d(X), b)$. The algorithm returns this candidate
unfolded, $(e(X), c(X), a, d(X), b)$.

Cheapest prefix   Extension/Completion      Cost
---------------   ------------------------  ----------------------------------------------
()                (a)                       10
                  (b)                       5
                  (c(X))                    5
                  (d(X))                    10
                  (e(X))                    20
(b)               (b, a)                    the adjacency restriction test fails
                  (b, c(X))                 5 + 2 · 5 = 15
                  (b, d(X))                 5 + 2 · 10 = 25
                  (b, e(X))                 the adjacency restriction test fails
(c(X))            (c(X), e(X), a, d(X), b)  5 + 2(10 + 0.1(10 + 0.8(5 + 1 · 5))) = 28.6
(a)               (a, b)                    10 + 0.8 · 5 = 14
                  (a, c(X))                 10 + 0.8 · 5 = 14
                  (a, d(X))                 10 + 0.8 · 10 = 18
                  (a, e(X))                 the adjacency restriction test fails
(d(X))            (d(X), c(X), e(X), a, b)  10 + 4(5 + 0.5(10 + 0.1(10 + 0.8 · 5))) = 52.8
(a, b)            (a, b, c(X))              14 + 0.8 · 2 · 5 = 22
                  (a, b, d(X))              14 + 0.8 · 2 · 10 = 30
                  (a, b, e(X))              the adjacency restriction test fails
(a, c(X))         (a, c(X), e(X), d(X), b)  14 + 0.8 · 2(10 + 0.1(5 + 1 · 5)) = 31.6
(b, c(X))         (b, c(X), e(X), a, d(X))  15 + 2 · 2(10 + 0.1(10 + 0.8 · 5)) = 60.6
(a, d(X))         (a, d(X), c(X), e(X), b)  18 + 0.8 · 4(5 + 0.5(10 + 0.1 · 5)) = 50.8
(e(X))            (e(X), c(X), a, d(X), b)  20 + 0.4(5 + 0.5(10 + 0.8(5 + 1 · 5))) = 25.6
(a, b, c(X))      (a, b, c(X), e(X), d(X))  22 + 0.8 · 2 · 2(10 + 0.1 · 5) = 55.6
(b, d(X))         (b, d(X), c(X), e(X), a)  25 + 2 · 4(5 + 0.5(10 + 0.1 · 10)) = 109
(e(X), c(X), a, d(X), b)                    complete ordering

Table 3: A trace of a sample run of Algorithm 4 on the set of Figure 8. The left column shows the
cheapest prefix extracted from the list on each step, the middle column - its extensions
or completions that are added to the list, and the right column - their associated costs.
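
The outcome of the sample run can be cross-checked by brute force. The Python sketch below is not the DAC algorithm and not Algorithm 4: it simply evaluates all 120 permutations of the five subgoals with the control values of Table 2. It assumes that the cost of an ordering is the nested sum cost_1 + nsols_1 · (cost_2 + nsols_2 · (...)), as in the cost expressions of Table 3, and that $X$ becomes bound as soon as the first of $c(X)$, $d(X)$, $e(X)$ has been proved. All names are illustrative.

```python
from itertools import permutations

# (cost, nsols) from Table 2; subgoals over X have a free and a bound variant.
FREE = {"a": (10, 0.8), "b": (5, 2), "c": (5, 2), "d": (10, 4), "e": (20, 0.4)}
BOUND = {"c": (5, 0.5), "d": (5, 1), "e": (10, 0.1)}

def ordering_cost(ordering):
    """Cost of proving the subgoals in the given order."""
    total, weight, x_bound = 0.0, 1.0, False
    for goal in ordering:
        cost, nsols = BOUND[goal] if (x_bound and goal in BOUND) else FREE[goal]
        total += weight * cost      # this subgoal is tried once per earlier solution
        weight *= nsols
        x_bound = x_bound or goal in BOUND
    return total

best = min(permutations("abcde"), key=ordering_cost)
print(best, round(ordering_cost(best), 1))
```

Under these assumptions the cheapest permutation is (e, c, a, d, b) with cost 25.6, which agrees with the ordering returned in step 7 and with the last completion in Table 3.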



For comparison, we now show how the same task is performed by Algorithm 4. The
algorithm maintains a list of prefixes, sorted by their cost values, which initially contains
an empty sequence. On each step the algorithm extracts from the list its cheapest element,
and adds to the list the extensions or completions of this prefix. Extensions are created when
the set of remaining subgoals is dependent, by appending each of the remaining subgoals
to the end of the prefix. Completions are created when the set of remaining subgoals is
independent, by sorting them and appending the entire resulting sequence to the prefix. An
extension is added to the list only when the adjacency restriction test succeeds on its two
last subgoals. To make the list operations faster, we can implement it as a heap structure
(Cormen et al., 1991).

The trace of Algorithm 4 on the set $S_0$ is shown in Table 3. The left column shows the
cheapest prefix extracted from the list on each step, the middle column - its extensions or 
completions that are added to the list, and the right column - their associated costs. 

On this example, the DAC algorithm orders the given set $S_0$ more efficiently than Algorithm 4.
We can compare several discrete measurements to show this. For example, Algorithm 6
performs 4 sorting sessions, each one with 2 elements, while Algorithm 4 performs 5 sortings
with 2 elements, and 3 sortings with 3 elements. The adjacency restriction is tested only 3
times by Algorithm 6, and 11 times by Algorithm 4. Algorithm 6 creates a total of 8 different
ordered sub-sequences, with total length 22, while Algorithm 4 creates 24 ordered prefixes,
with total length 55.

Figure 14: An example of the worst case for ordering. When all variables are initially free, every
subset of subgoals is indivisible under the binding of the rest of subgoals, and the overall
complexity of ordering by Algorithm 6 is $O(n!)$.

4.8 Complexity Analysis 

Both Algorithm 4 and Algorithm 6 find a minimal ordering, and both sort independent 
subsets of subgoals whenever possible. Algorithm 6, however, offers several advantages due 
to its divide-and-conquer strategy. 

Let $n$ be the number of subgoals in the initial set. For convenience, we assume that
the time of computing the control values for one subgoal is $O(1)$; otherwise, if this time
is $r$, all the complexities below must be multiplied by $r$. The worst case complexity of
Algorithm 6 is $O(n!)$. Figure 14 shows an example of such a case for $n = 5$. In this set
every two subgoals share a variable that does not appear in other subgoals. Thus, other
subgoals cannot bind it. The set of the root is indivisible, and no matter which binder is
chosen, the sets of the children are indivisible. So, in each child of the root, we must select
every remaining subgoal as the binder, and so on. The overall complexity of this execution
is $O(n!)$. This is indeed the worst-case complexity: presence of AND-nodes in the tree can
only reduce it.

Note that even when n is small, such a complex rule body with $\binom{n}{2}$ free variables is 
very improbable in practical programs. Also, the worst-case complexity can be reduced 
to $O(n^2 \times 2^n)$, if we move from divisibility trees to divisibility graphs (DAGs), where all 
identical nodes of a divisibility tree (same subgoal set, same binding set) are represented 
by a single vertex. The equivalence test of the tree nodes can be performed efficiently with 
the help of trie structures (Aho et al., 1987), where subgoals are sorted lexicographically. 
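
To make the gap between the tree and the DAG concrete, the following Python sketch counts nodes in a simplified model of the worst case described above: every non-empty set is indivisible and every member is tried as the binder. It is only a counting model under these assumptions, not the DAC algorithm itself; the function names are illustrative, and the recursion is carried down to the empty set for simplicity.

```python
def tree_nodes(n: int) -> int:
    """Nodes of the worst-case tree: a set of size s has s children of size s - 1."""
    if n == 0:
        return 1
    return 1 + n * tree_nodes(n - 1)


def dag_nodes(n: int) -> int:
    """Distinct subgoal sets reachable from the root when identical nodes are shared."""
    seen = set()

    def visit(remaining: frozenset) -> None:
        if remaining in seen:
            return  # this (subgoal set, binding set) pair already has a vertex
        seen.add(remaining)
        for binder in remaining:
            # The binding set is the complement of `remaining`, so the
            # remaining set alone identifies the node.
            visit(remaining - {binder})

    visit(frozenset(range(n)))
    return len(seen)
```

In this model, `tree_nodes(5)` is 326 (of which 5! = 120 are leaves), whereas `dag_nodes(5)` is 2^5 = 32, because the order in which the binders were chosen no longer matters once identical nodes are merged.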

Let there be n subgoals, with v shared variables appearing in m subgoals. As was 
already noted in Section 4.3, the partition of subgoals into subsets can be performed in 
$O(n)$ average time, using a Union-Find data structure (Cormen et al., 1991, Chapter 22). 
In the worst possible case, there are no AND-nodes in the divisibility tree, apart from the 
root node (whose set is divisible into a dependent set of size m and an independent set of 
size n − m). The overall complexity of the DAC algorithm in such a case is 

\[
\begin{aligned}
T(n, m, v) ={} & O(n) && \text{(divisibility partition)} \\
 {}+{} & O\bigl((n - m)\log(n - m)\bigr) && \text{(ordering of independent subgoals)} \\
 {}+{} & O\bigl(\ldots\,(m - k)\log(m - k)\bigr) && \text{(ordering of dependent subgoals, folding, merging)}
\end{aligned}
\]

where k is the maximal possible number of bindings performed before the remaining subset 
is independent. If we assume that every subgoal binds all its free variables (which happens 
very frequently in practical logic programs), then k = min{v, m − 1}; otherwise k = m − 1. 
The value of k equals the maximal number of OR-nodes on a path from the root to a leaf of the 
divisibility tree. Therefore, the height of the divisibility tree is limited by k + 1. Actually, 
the tree can be shallower, since some binders can bind more than one shared variable each. 
This means that the number of shared variables can decrease by more than 1 in each OR-node. 
Below we simplify the above formula for several common cases, when k is small and 
when the above-mentioned assumption holds (every subgoal binds all its free variables after 
its proof terminates). 

• If $v < m \ll n$: $T(n, m, v) = O(n \cdot m^{v} + n \cdot \log n)$ 

• If $m < v \ll n$: $T(n, m, v) = O(n \cdot m^{m-1} + n \cdot \log n)$ 

• If $v \ll m \approx n$: $T(n, m, v) = O(n^{v+1} \cdot \log n)$ 

• If $m \ll v \approx n$: $T(n, m, v) = O(n \cdot m! + n \cdot \log n)$ 

Generally, for a small number v of shared variables, the complexity of the algorithm is 
roughly bounded by $O(n^{v+1} \cdot \log n)$. In particular, if all subgoals are independent (v = 0), 
the complexity is $O(n \log n)$. In most practical cases, the number of shared free variables 
in a rule body is relatively small, and every subgoal binds all its free variables; therefore, 
the algorithm has polynomial complexity. Note that even if a rule body in the program 
text contains many free variables, most of them usually become bound after the rule head 
unification is performed (i.e., before we start the ordering of the instantiated body). 

5. Learning Control Knowledge for Ordering 

The ordering algorithms described in the previous sections assume the availability of correct 
values of average cost and number of solutions for various predicates under various argument 
bindings. In this section we discuss how this control knowledge can be obtained by learning. 

Instead of static exploration of the program text (Debray & Lin, 1993; Etzioni, 1993), 
we adopt the approach of Markovitch and Scott (1989) and learn the control knowledge 
by collecting statistics on the literals that were proved in the past. This learning can be 
performed on-line or off-line. In the latter case, the ordering system first works with a 
training set of queries, while collecting statistics. This training set can be built on the 
distribution of user queries seen in the past. We assume that the distribution of queries 
received by the system does not change significantly with time; hence, the past distribution 
directs the system to learn relevant knowledge for the future queries. 

While proving queries, the learning component accumulates information about the con- 
trol values (average cost and number of solutions) of various literals. Storing a separate 
value for each literal is not practical, for two reasons. The first is the large space required 
by this approach. The second is the lack of generalization: the ordering algorithm is quite 
likely to encounter literals which have not been seen before, and whose control values are 
unknown. Recall that when we transformed Equation 2 into Equation 5, we moved from 
control values of single literals to average control values over sets of literals. To obtain the 
precise averages for these sets, we still needed the control values of individual literals. Here, 
we take a different approach, that of learning and using control values for more general 
classes of literals. The estimated cost (nsols) value of a class can be defined as the average 
real cost (nsols) value of all examples of this class that were proved in the past. 

The more refined the classes, the smaller the variance of real control values inside each 
class, the more precise the cost and nsols estimations that the classes assign to their members, 
and the better orderings we obtain. One easy way to define classes is by modes 
or binding patterns (Debray & Warren, 1988; Ullman & Vardi, 1988): for each argument 
we denote whether it is free or bound. For example, for the predicate father the 
possible classes are father(free, free), father(bound, free), father(free, bound) and 
father(bound, bound). Now, if we receive a literal (for example, father(abraham,X)), 
we can easily determine its binding pattern (in this case, father(bound, free)) and retrieve 
the control information stored for this class. Of course, to find the binding pattern 
of a subgoal with a given binding set, we need a method to determine which variables are 
bound by the subgoals of the binding set. The same problem arose in DPart computation 
(Section 4.3). We shall discuss some practical ways to solve this problem in Section 7.1. 
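
Classes defined by binding patterns, together with the averaging of observed control values, can be sketched in a few lines of Python. This is a minimal illustration, not the system's implementation: the representation of literals as a predicate name plus a tuple of argument strings, the `ControlStats` class and the Prolog-style test for variables are all assumptions made for the example.

```python
from collections import defaultdict


def is_variable(term: str) -> bool:
    """Prolog convention: variables start with an upper-case letter or '_'."""
    return term[:1].isupper() or term.startswith("_")


def binding_pattern(predicate, args, bound_vars):
    """Class of a literal: each argument is 'bound' or 'free'.

    `bound_vars` holds the variables already bound by the binding set.
    """
    modes = tuple(
        "free" if is_variable(a) and a not in bound_vars else "bound"
        for a in args
    )
    return (predicate, modes)


class ControlStats:
    """Average cost and nsols of the literals proved so far, per class."""

    def __init__(self):
        self._sums = defaultdict(lambda: [0.0, 0.0, 0])  # cost, nsols, count

    def record(self, cls, cost, nsols):
        entry = self._sums[cls]
        entry[0] += cost
        entry[1] += nsols
        entry[2] += 1

    def estimate(self, cls):
        entry = self._sums.get(cls)
        if entry is None or entry[2] == 0:
            return None  # no example of this class has been seen yet
        return entry[0] / entry[2], entry[1] / entry[2]
```

With this representation, `binding_pattern("father", ("abraham", "X"), set())` yields the class `("father", ("bound", "free"))`, and `estimate` returns the averages of whatever values were recorded for that class.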

For the purpose of class definition we can also use regression trees - a type of decision tree 
that predicts continuous numeric values rather than discrete classes (Breiman et al., 1984; 
Quinlan, 1986). Two separate regression trees can be stored for every program predicate, 
one for its cost values, and one for the nsols. The tests in the tree nodes can be defined 
in various ways. If we only use the test "is argument i bound?", then the classes of literals 
defined by regression trees coincide with the classes defined by binding patterns. But we 
can also apply more sophisticated tests, both syntactic (e.g., "is the third argument a term 
with functor f?") and semantic (e.g., "is the third argument female?"), which leads to 
more refined classes and better estimations. A possible regression tree for estimating the 
number of solutions for predicate father is shown in Figure 15. 
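
A regression tree of this kind is easy to represent and evaluate. The following Python sketch is illustrative only: the tree shape, the `FEMALE` facts and the leaf values are invented placeholders (they are not taken from Figure 15), and a term is treated as bound whenever it is not a variable.

```python
from dataclasses import dataclass
from typing import Callable, Tuple, Union

Args = Tuple[str, ...]


@dataclass
class Leaf:
    value: float  # estimated nsols for literals that reach this leaf


@dataclass
class Node:
    test: Callable[[Args], bool]
    if_true: Union["Node", Leaf]
    if_false: Union["Node", Leaf]


FEMALE = {"sarah", "rebecca"}  # hypothetical facts used by the semantic test


def is_bound(term: str) -> bool:
    # Prolog convention: variables start with an upper-case letter or "_".
    return not (term[:1].isupper() or term.startswith("_"))


def estimate(tree, args: Args) -> float:
    """Walk from the root to a leaf, applying one test per inner node."""
    while isinstance(tree, Node):
        tree = tree.if_true if tree.test(args) else tree.if_false
    return tree.value


# Placeholder tree for father(arg1, arg2); all leaf values are made up,
# except that a female first argument can never yield a solution.
FATHER_NSOLS = Node(
    test=lambda a: is_bound(a[0]),
    if_true=Node(
        test=lambda a: a[0] in FEMALE,
        if_true=Leaf(0.0),
        if_false=Node(lambda a: is_bound(a[1]), Leaf(0.1), Leaf(3.0)),
    ),
    if_false=Node(lambda a: is_bound(a[1]), Leaf(1.0), Leaf(50.0)),
)
```

Restricting the tests to `is_bound` would reproduce the binding-pattern classes; the `FEMALE` test shows how a semantic test splits a class further.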

Semantic tests about the arguments require logic inference (in the example of Figure 15 
- invoking the predicate female on the first argument of the literal). Therefore, they must 
be as efficient as possible. Otherwise the retrieval of control values will take too much time. 
The problem of efficient learning of control values is further considered elsewhere (Ledeniov 
& Markovitch, 1998a). 

Several researchers applied machine learning techniques for accelerating logic inference 
(Cohen, 1990; DeJong & Mooney, 1986; Langley, 1985; Markovitch & Scott, 1993; Minton, 
1988; Mitchell, Keller, & Kedar-Cabelli, 1986; Mooney & Zelle, 1993; Prieditis & Mostow, 
1987). Some of these works used explanation-based learning or generalized caching techniques 
to avoid repeated computation. Others utilized the acquired knowledge for the problem 
of clause selection. None of these works, however, dealt with the problem of subgoal 
reordering. 

Figure 15: A regression tree that estimates the number of solutions for father(arg1, arg2). 



6. Experimentation 

To test the effectiveness of our ordering algorithm, we experimented with it on various 
domains, and compared its performance to other ordering algorithms. Most experiments 
were performed on randomly created artificial domains. We also tested the performance of 
the system on several real domains. 

6.1 Experimental Methodology 

All experiments described below consist of a training session, followed by a testing session. 
Training and testing sets of queries are randomly drawn from a fixed distribution. In the 
training session we collect the control knowledge for literal classes. In the testing session we 
prove the queries of the testing set using different ordering algorithms, and compare their 
performance using various measurements. 

The goal of ordering is to reduce the time spent by the Prolog interpreter when it 
proves queries of the testing set. This time is the sum of the time spent by the ordering 
procedure (ordering time) and the time spent by the interpreter (inference time). Since the 
CPU time is known to be very sensitive to irrelevant factors such as hardware, software 
and programming quality, we also show two alternative discrete measurements: the total 
number of clause unifications, and the total number of clause reductions performed. The 
number of reductions reflects the size of the proof tree. 

For experimentation we used a new version of the LASSY system (Markovitch & Scott, 
1989), using regression trees for learning, and the ordering algorithms discussed in this 
paper. 

6.2 Experiments with Artificial Domains 

In order to ensure the statistical significance of the results of comparing different ordering 
algorithms, we experimented with many different domains. For this purpose, we created a 
set of 100 artificial domains, each with a small fixed set of predicates, but with a random 
number of clauses in each predicate, and with random rule lengths. Predicates in the 
rule bodies, and arguments in both rule heads and bodies are randomly drawn from fixed 
distributions. Each domain has its own training and testing sets (these two sets do not 
intersect). 

The more training examples are fed into the system in the learning phase, the better 
the estimations of control values it produces. On the other hand, the learning time must be 
limited, because after seeing a certain number of training examples, new examples do not bring 
much new information, and additional learning becomes wasteful. We have experimentally 
built a learning curve which shows the dependence of the quality of the control knowledge 
on the amount of training. The curve suggests that after control values were learned for 
approximately 400 literals, there is no significant improvement in the quality of ordering 
with new training examples. Therefore, in the subsequent experiments we stopped training 
after 600 cost values were learned. The training time was always small: one learned cost 
value corresponds to a complete proof of a literal. Thus, if every predicate in a program has 
four clauses that define it, then 600 cost values are learned after 2400 unifications, which 
takes very little time. 

The control values were learned by means of regression trees (Section 5), with simple 
syntactic tests that only checked whether some argument is bound or whether some argu- 
ment is a term with a certain functor (the list of functors was created automatically when 
the domain was loaded). However, as we shall see, even these simple tests succeeded in 
making good estimations of control values. 

We tested the following ordering methods: 

• Random: The subgoals are permuted randomly and the control knowledge is not 
used. 

• Algorithm 3: Building ordered prefixes. Out of all prefixes that are permutations of 
one another, only the cheapest one is retained. 

• Algorithm 3a: As Algorithm 3, but with best-first search method used to define the 
next processed prefix. A similar algorithm was used in the LASSY system of Markovitch 
and Scott (1989). 

• Algorithm 3b: As Algorithm 3a, but with adjacency restriction test added. A 
similar algorithm was described by Smith and Genesereth (1985). 

• Algorithm 4: As Algorithm 3b, but whenever all the subgoals that are not in the 
prefix are independent (under the binding of the prefix), they are sorted and the result 
is appended to the prefix as one unit. 

• Algorithm 6: The DAC algorithm. 
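
The prefix-based methods in this list can be illustrated with a small best-first search. The sketch below is not a transcription of Algorithms 3-4: it assumes a multiplicative cost model (each subgoal is executed once for every solution of the prefix before it), assumes that every subgoal binds all of its variables, and uses made-up control values chosen so that the number of solutions of a set of subgoals does not depend on their order. All names are hypothetical. Prefixes are kept in a heap, and only the cheapest among prefixes that are permutations of one another is retained.

```python
import heapq
from itertools import count, permutations

# Variables of each subgoal (hypothetical rule body: p(X), q(X,Y), r(Y)).
SUBGOALS = {"p": ("X",), "q": ("X", "Y"), "r": ("Y",)}

# Made-up control values: (predicate, binding pattern) -> (cost, nsols).
CONTROL = {
    ("p", ("free",)): (10, 10),
    ("p", ("bound",)): (1, 0.5),
    ("r", ("free",)): (20, 20),
    ("r", ("bound",)): (1, 0.2),
    ("q", ("free", "free")): (100, 100),
    ("q", ("bound", "free")): (5, 5),
    ("q", ("free", "bound")): (2, 1),
    ("q", ("bound", "bound")): (1, 0.05),
}


def extend(cost, sols, bound, name):
    """Append subgoal `name` to a prefix described by (cost, sols, bound)."""
    pattern = tuple("bound" if v in bound else "free" for v in SUBGOALS[name])
    c, n = CONTROL[(name, pattern)]
    # The subgoal runs once per solution of the prefix and binds its variables.
    return cost + sols * c, sols * n, bound | frozenset(SUBGOALS[name])


def ordering_cost(order):
    cost, sols, bound = 0.0, 1.0, frozenset()
    for name in order:
        cost, sols, bound = extend(cost, sols, bound, name)
    return cost


def best_first_order(names):
    tie = count()  # tie-breaker so the heap never compares prefixes
    heap = [(0.0, next(tie), (), 1.0, frozenset())]
    cheapest = {}  # set of subgoals -> cost of the cheapest prefix seen
    while heap:
        cost, _, prefix, sols, bound = heapq.heappop(heap)
        if len(prefix) == len(names):
            return list(prefix), cost  # costs never decrease, so this is minimal
        for name in names:
            if name in prefix:
                continue
            new_cost, new_sols, new_bound = extend(cost, sols, bound, name)
            key = frozenset(prefix + (name,))
            if cheapest.get(key, float("inf")) <= new_cost:
                continue  # a permutation of this prefix is at least as cheap
            cheapest[key] = new_cost
            heapq.heappush(
                heap, (new_cost, next(tie), prefix + (name,), new_sols, new_bound)
            )
    return [], 0.0
```

On this toy rule body, `best_first_order(["p", "q", "r"])` returns the ordering r, q, p with cost 80, which agrees with taking the minimum of `ordering_cost` over all six permutations.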

In our experiments we always used the Bubble-Sort algorithm to sort literals in independent 
sets. This algorithm is easy to implement, and it is known to be efficient for small 
sets, when the elements are already ordered, or nearly ordered. In practice, programmers 
order most program rules optimally, and the sorting stops early. 

Table 4: The effect of ordering on the tree sizes and the CPU time (mean results over 100 artificial 
domains). 

Since the non-deterministic nature of the random method introduces additional noise, 
we performed on each artificial domain 20 experiments with this method, and the table 
presents the average values of these measurements. 

Table 4 shows the obtained results over 100 domains: the rows correspond to the ordering 
methods used, and the columns to the measurements taken. The rightmost column shows 
the ratio of the ordering time and the number of reductions performed, which reflects the 
average ordering time of one rule body. The inference time was not measured separately, 
but was set as the difference of the total time and the ordering time. 

Several observations can be made: 

1. Using the DAC ordering algorithm helps to reduce the total time of proving the testing 
set of queries by a factor of 10, compared to the random ordering. The inference time 
is reduced by a factor of 25. 

2. All deterministic ordering methods have similar numbers of unifications and reductions, 
and similar inference time, which is predictable, since they all find minimal orderings. 
Small fluctuations of these values can be explained by the fact that some rules have 
several minimal orderings under the existing control knowledge, and different ordering 
algorithms select different minimal orderings. Since the control knowledge is not 
absolutely precise, the real execution costs of these orderings may be different, which 
leads to the differences. The random ordering method builds much larger trees, with 
larger inference time. 

3. When we compare the performance of the deterministic algorithms (3-6), we see 
that the DAC algorithm performs much better than the algorithms that build ordered 
prefixes. In the latter ones, the ordering is expensive, and smaller inference time 
cannot compensate for the increase in ordering time. Only Algorithm 4, a combination 
of several ideas of previous researchers, has total time comparable with the time of 
the random method (though still greater). 

4. It may seem strange that the simple random ordering method has larger ordering time 
than the sophisticated Algorithm 6. To explain this, note that the random method 
creates much larger proof trees (on average), therefore the number of ordered rules 
increases, and even the cheap operations, like random ordering of a rule, sum up to 
a considerable time. The average time spent on ordering of one rule is shown in the 
last column of Table 4; this value is very small in the random method. 

6.3 Experiments with Real Domains 

We tested our ordering algorithm also on real domains obtained from various sources. These 
domains allow us to compare orderings performed by our algorithm with orderings per- 
formed by human programmers. 

The following domains were used: 

• Moral-reasoner: Taken from the Machine Learning Repository at the University 
of California, Irvine¹. The domain qualitatively simulates moral reasoning: whether 
a person can be considered guilty, given various aspects of his character and of the 
crime performed. 

• Depth-first planner: Program 14.11 from the book "The Art of Prolog" (Sterling 
& Shapiro, 1994). The program implements a simple planner for the blocks world. 

• Biblical Family Database: A database similar to that described in Example 1. 

• Appletalk: A domain describing the physical layout of a local computer network 
(Markovitch, 1989). 

• Benchmark: A Prolog benchmark taken from the CMU Artificial Intelligence Repository². 
The predicate names are not informative: it is an example of a program where 
manual ordering is difficult. 

• Slow reverse: Another benchmark program from the same source. 

• Geography: Also a benchmark program from the CMU Repository. The domain 
contains many geographical facts about countries. 

Table 5 shows the results obtained. For ordering we used the DAC algorithm, with literal 
classes defined by binding patterns. It can be seen that the DAC algorithm was able to speed 
up the logic inference in real domains as well. Note that in the Slow Reverse domain the 
programmer's ordering was already optimal; thus, applying the ordering algorithm did not 
reduce the tree sizes. Still, the overhead of the ordering is not significant. 

7. Discussion 

In this concluding section we discuss several issues concerning the practical implementation 
of the DAC algorithm and several ways to increase its efficiency. Then we survey some 
related areas of logic programming and propose the use of the DAC algorithm there. 

1. URL: http://www.ics.uci.edu/~mlearn/MLRepository.html 

2. URL: http://www.cs.cmu.edu/afs/cs.cmu.edu/project/ai-repository/ai/html/air.html 

Table 5: Experiments on real domains. 



7.1 Practical Issues 

In this subsection we would like to address several issues related to implementation and 
applications of the DAC algorithm. 

The computation of the DPart function (Section 3.2.1) requires a procedure for com- 
puting the set of variables bound by a given binding set of subgoals. The same procedure 
is needed for computing control values (Section 5). There are several possible ways to 
implement such a procedure. For example: 

1. The easiest way is to assume that every subgoal binds all the variables appearing in its 
arguments. This simplistic assumption is sufficient for many domains, especially the 
database-oriented ones. However, it is not appropriate when logic programs are used 
to manipulate complex data structures containing free variables (such as difference 
lists). This assumption was used for the experiments described in Section 6. 

2. Some dialects of Prolog and other logic languages support mode declarations provided 
by the user (Somogyi et al., 1996b). When such declarations are available, it is easy 
to infer the binding status of each variable upon exiting a subgoal. 

3. Even when the user did not supply enough mode declarations, they can often be 
inferred from the structure of the program by means of static analysis (Debray & 
Warren, 1988). Note, however, that as was pointed out by Somogyi et al. (1996b), 
no-one has yet demonstrated a mode inference algorithm that is guaranteed to find 
accurate mode information for every predicate in the program. 

4. We can learn the sets of variables bound by classes of subgoals using methods similar 
to those described in Section 5 for learning control values. 
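
Under the first option (every subgoal binds all the variables in its arguments), both the bound-variable computation and a partition by shared free variables are short. The Python sketch below is an assumption-laden illustration rather than the paper's DPart: subgoals are hypothetical (name, variables) pairs, the partition is simply the set of connected components under "shares a free variable", and the union-find uses path halving without rank. How the actual DPart function treats the singleton components is not reproduced here.

```python
def bound_vars(binding_set):
    """Option 1: every subgoal binds all the variables in its arguments."""
    return {v for _, variables in binding_set for v in variables}


def find(parent, i):
    """Root of i's component, with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def partition(subgoals, binding_set):
    """Group subgoals that are connected through shared free variables."""
    bound = bound_vars(binding_set)
    parent = list(range(len(subgoals)))
    first_owner = {}  # free variable -> index of the first subgoal containing it
    for i, (_, variables) in enumerate(subgoals):
        for v in variables:
            if v in bound:
                continue  # bound variables do not create dependence
            if v in first_owner:
                parent[find(parent, i)] = find(parent, first_owner[v])
            else:
                first_owner[v] = i
    groups = {}
    for i, subgoal in enumerate(subgoals):
        groups.setdefault(find(parent, i), []).append(subgoal)
    return list(groups.values())
```

For the body a(X), b(X,Y), c(Y,Z), d(W) with an empty binding set, the sketch yields the groups {a, b, c} and {d}; once a binder has bound Y, the first group splits into {a, b} and {c}.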

Several researchers advocate user declarations of available (permitted) modes. Such 
declarations can be elegantly incorporated into our algorithm to prune branches that violate 
available modes. When we fix a binder in an OR-node, we compute the set of variables 
that become bound by it. If this results in a violation of an available mode for one of the 
subgoals of the corresponding child, then the whole subtree of this child is pruned. Note 
that we can detect violations even when the mode of the subgoal is partially unknown 
at the moment. For example, if all the available modes require that the first argument 
be unbound, then binding of the argument by the OR-node binder will trigger pruning, 
even if the binding status of the other arguments is not yet known. Figure 16 shows how 
Algorithm 6 can be changed in order to incorporate declarations of available modes. Any 
other correctness requirement can be treated in a similar manner: a candidate ordering will 
be rejected whenever we see that it violates the requirement. 

CandidateSet(S, B) 
    let {S1, S2, ..., Sk} ← DPart(S, B) 
    case 
    • k = 1, shared-vars(S1) ≠ ∅ (S is indivisible under B): 
        loop for A ∈ S 
            if B ∪ {A} does not violate available modes 
                    in any subgoal of S \ {A} 
            then 
                let C(A) ← CandidateSet(S \ {A}, B ∪ {A}) 
                let C'(A) ← { Fold(A‖O_A, B) | O_A ∈ C(A) } 
            else let C'(A) ← ∅ (don't enter the branch) 
        Return ⋃_{A ∈ S} C'(A) 

Figure 16: Changes to Algorithm 6 that make use of available mode declarations. 
The rest of the algorithm remains unchanged. 
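
The pruning test itself can be sketched as follows. This is a hedged illustration under simplifying assumptions: arguments are variable names only, every subgoal binds all of its variables, a mode is a tuple of "bound"/"free" requirements, and an argument that is not yet bound has status "unknown" because a later binder may still bind it. The function names and data layout are hypothetical.

```python
def status_of(args, bound):
    """'bound' for variables known to be bound, 'unknown' for the rest."""
    return tuple("bound" if a in bound else "unknown" for a in args)


def violates(available_modes, status):
    """True when no available mode can still be satisfied.

    A mode is ruled out as soon as it requires 'free' for an argument that
    is already bound; 'unknown' arguments rule nothing out.
    """
    def still_possible(mode):
        return all(not (s == "bound" and m == "free")
                   for m, s in zip(mode, status))

    return not any(still_possible(mode) for mode in available_modes)


def binder_admissible(binder, remaining, bound, modes):
    """Check a candidate binder before entering its branch of the OR-node."""
    new_bound = set(bound) | set(binder[1])
    return not any(
        violates(modes[name], status_of(args, new_bound))
        for name, args in remaining
        if name in modes  # subgoals without declarations are unconstrained
    )
```

If every available mode of q requires its first argument to be free, then choosing p(X) as the binder for a body containing q(X,Y) is rejected immediately, even though the status of Y is still unknown; choosing r(Y) is not.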

The experiments described in Section 6 were performed with a Prolog interpreter. Is 
it possible to combine the DAC algorithm with a Prolog compiler? There are several ways 
to achieve this goal. One way is to allow the compiler to insert code for on-line learning. 
The compiled code will contain procedures for accumulating control values and for the DAC 
algorithm. Alternatively, off-line learning can be implemented, with training as a part of 
the compilation process. 

Another method for combining our algorithm with existing Prolog compilers is to use 
it for program transformation, and to process the transformed program by a standard 
compiler. Elsewhere (Ledeniov & Markovitch, 1998a) we describe the method for classifying 
the orderings produced by the DAC algorithm. For each rule we build a classification tree, 
where classes are the different orderings of the rule body, and the tests are applied to the 
rule head arguments. These are the same type of tests described in Section 5 for learning 
control values. Figure 17 shows two examples of such trees. 

Figure 17: Examples of classification trees that learn rule body orderings. 
First tree: a classification tree for the rule uncle(X,Y) of Example 1. Test: nonvar(Y)? 
Leaves: [parent(Z,Y), brother(Z,X)] and [brother(Z,X), parent(Z,Y)]. 
Second tree: a possible classification tree for the rule head(X,Y) ← p1(X), p2(Y), p3(X,Y). 
Tests: nonvar(X)?, male(X)?, nonvar(Y)? 
Leaves: [p1(X),p3(X,Y),p2(Y)], [p1(X),p2(Y),p3(X,Y)], [p2(Y),p3(X,Y),p1(X)], [p3(X,Y),p1(X),p2(Y)]. 

Given such a classification tree, we can write a set of Prolog rules, where each rule has 
the same head as the original rule, and has a body built of all the tests on the path from 
the tree root to a leaf node followed by the ordering at the leaf. For example, the second 
tree in Figure 17 yields the following set of rules: 

head(X,Y) ← nonvar(X), male(X), p1(X), p3(X,Y), p2(Y). 

head(X,Y) ← nonvar(X), not(male(X)), p1(X), p2(Y), p3(X,Y). 

head(X,Y) ← var(X), nonvar(Y), p2(Y), p3(X,Y), p1(X). 

head(X,Y) ← var(X), var(Y), p3(X,Y), p1(X), p2(Y). 
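
The translation from a classification tree to rules is mechanical. A minimal Python sketch follows, assuming a hypothetical nested-tuple encoding of the tree (test, yes-branch, no-branch) with lists of subgoals as leaves, and assuming that a failed nonvar test is written as var and any other failed test as not(...). The tree shape is inferred from the four rules above, and the output uses Prolog's `:-` in place of the arrow.

```python
def negate(test: str) -> str:
    """Body literal for the 'no' branch of a test."""
    if test.startswith("nonvar("):
        return "var(" + test[len("nonvar("):]
    return f"not({test})"


def rules_from_tree(head, tree, path=()):
    """One rule per leaf: tests on the path, then the ordering at the leaf."""
    if isinstance(tree, list):  # leaf: an ordering of the rule body
        return [f"{head} :- {', '.join(list(path) + tree)}."]
    test, yes_branch, no_branch = tree
    return (
        rules_from_tree(head, yes_branch, path + (test,))
        + rules_from_tree(head, no_branch, path + (negate(test),))
    )


# Assumed encoding of the second tree of Figure 17.
TREE = (
    "nonvar(X)",
    ("male(X)",
     ["p1(X)", "p3(X,Y)", "p2(Y)"],
     ["p1(X)", "p2(Y)", "p3(X,Y)"]),
    ("nonvar(Y)",
     ["p2(Y)", "p3(X,Y)", "p1(X)"],
     ["p3(X,Y)", "p1(X)", "p2(Y)"]),
)

RULES = rules_from_tree("head(X,Y)", TREE)
```

`RULES` then holds the same four clauses, in the same order, ready to be handed to a standard compiler.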



From Table 4 we can see that while the DAC algorithm helped to reduce the inference 
time by a factor of 25, the total time was reduced only by a factor of 10. This difference 
is caused by the additional computation of the ordering procedure. There is a danger that 
the benefit obtained by ordering will be outweighed by the cost of the ordering process. 
This is a manifestation of the so-called utility problem (Minton, 1988; Markovitch & Scott, 
1993). In systems that are strongly-moded (such as Mercury - Somogyi et al., 1996b) we can 
employ the DAC algorithm statically at compilation time for each one of the available modes, 
thus reducing the run-time ordering time to zero. The mode-based approach performs only 
syntactic tests of the subgoal arguments. The classification tree method, described above, 
is a generalization of the mode-based approach, allowing semantic tests as well. 

Due to insufficient learning experience or lack of meaningful semantic tests, it is quite 
possible that the classification trees contain leaves with large degrees of error. In such cases 
we still need to perform the ordering dynamically. To reduce the harmfulness of the utility 
problem in the case of dynamic ordering, we can use a cost-sensitive variation of the DAC 
algorithm (Ledeniov & Markovitch, 1998a, 1998b). This modified algorithm deals with the 
problem by explicit reasoning about the economy of the control process. The algorithm is 
anytime, that is, it can be stopped at any moment and return its currently best ordering 
(Boddy & Dean, 1989). We learn a resource-investment function to compute the expected 
return in speedup time for additional control time. This function is used to determine a 
stopping condition for the anytime procedure. We have implemented this framework and 
found that indeed we have succeeded in reducing ordering time, without significant increase 
of inference time. 

7.2 Relationship to Other Works 

The work described in this paper is a continuation of the line of research initiated by Smith 
and Genesereth (1985) and continued by Natarajan (1987) and Markovitch and Scott (1989). 
This line of research aims at finding the most efficient ordering of a set of subgoals. The 
search for minimal-cost ordering is based on cost analysis that utilizes available information 
about the cost and number-of-solutions of individual subgoals. 

Smith and Genesereth (1985) performed an exhaustive search over the space of all 
permutations of the given set of subgoals, using the adjacency restriction to reduce the 
size of the search space (Equation 8). This restriction was applied on pairs of adjacent 
subgoals in the global ordering of the entire set. When applied to an independent set of 
subgoals, the adjacency restriction is easily transformed into the sorting restriction: the 
subgoals in a minimal ordering must be sorted by their cn values. Natarajan (1987) arrived 
at this conclusion and presented an efficient ordering algorithm for independent sets. 
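
For an independent set, the sorting restriction can be checked numerically. The sketch below assumes a multiplicative cost model (each subgoal is executed once for every solution of the subgoals before it) and fixed (cost, nsols) pairs with positive cost. Under that assumption, placing A before B is no worse than the reverse exactly when (nsols(A) − 1)/cost(A) ≤ (nsols(B) − 1)/cost(B), so sorting by this ratio in ascending order gives a minimal ordering. The key is named `cn` in the code for convenience; its exact form is derived from the assumed model, and the numbers are made up.

```python
from itertools import permutations


def ordering_cost(order):
    """Total cost of executing (cost, nsols) subgoals in the given order."""
    total, sols = 0.0, 1.0
    for cost, nsols in order:
        total += sols * cost  # run once per solution of the preceding subgoals
        sols *= nsols
    return total


def cn(subgoal):
    """Sort key derived from the adjacency argument (requires cost > 0)."""
    cost, nsols = subgoal
    return (nsols - 1) / cost


def sort_independent(subgoals):
    return sorted(subgoals, key=cn)


EXAMPLE = [(10, 3), (2, 0.5), (5, 1), (1, 4)]  # hypothetical (cost, nsols) pairs
```

On `EXAMPLE`, the sorted ordering costs 11, the same as the minimum of `ordering_cost` over all 24 permutations, but it is found with a single sort instead of an exhaustive search.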

The DAC algorithm uses subgoal dependence to break the set into smaller subsets. In- 
dependent subsets are sorted. Dependent subsets are recursively ordered, and the resulting 
orderings are merged using a generalization of the adjacency restriction that manipulates 
blocks of subgoals. Therefore the DAC algorithm is a generalization of both algorithms. 

During the last decade, a significant research effort went into static analysis (SA) of 
logic programs. There are three types of SA that can be exploited by the DAC algorithm to 
reduce the ordering time. 

A major part of the SA research deals with program termination (De Schreye & Decorte, 
1994). The DAC algorithm solves the termination problem, as a special case of the efficiency 
problem (it always finds a terminating ordering, if such orderings exist). During learning, 
we set limits on the computation resources available for subgoal execution. If a subgoal is 
non-terminating (in a certain mode), the learning module will associate a very high cost 
with this particular mode. Consequently, the DAC algorithm will not allow orderings with 
this mode of the subgoal. Nevertheless, while the use of static termination analysis is 
not mandatory for a proper operation of the DAC algorithm, we can exploit such analysis 
to increase the efficiency of both the learning process and the ordering process. During 
learning, the limit that we set on the computation resources devoted to the execution of 
a subgoal must be high, to increase the reliability of the cost estimation. However, such 
a high limit can lead to a significant increase in learning time when many subgoals are 
non-terminating. If termination information obtained by SA is available, we can use it to 
avoid entering infinite branches of proof trees. During ordering, termination information can 
serve to reduce the size of space of orderings searched by the algorithm. If the termination 
information comes in the form of allowed modes (Somogyi et al., 1996b), orderings that 
violate these modes are filtered out, as in the modified algorithm shown in Figure 16. If the 
termination information comes in the form of a partial order between subgoals, orderings 
that violate this partial order can be filtered out in a similar manner. 
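
Filtering by a partial order takes one pass over each candidate ordering. A minimal Python sketch, with hypothetical subgoal names and a hypothetical `must_precede` set of pairs assumed to come from some termination analysis; enumerating all permutations is done here only for illustration, since in the DAC setting the same check would prune branches during the search.

```python
from itertools import permutations


def respects(order, must_precede):
    """True when every pair (a, b) present in `order` has a before b."""
    position = {name: i for i, name in enumerate(order)}
    return all(
        position[a] < position[b]
        for a, b in must_precede
        if a in position and b in position
    )


def admissible_orderings(subgoals, must_precede):
    """Orderings that survive the filter (exhaustive, for small examples only)."""
    return [list(o) for o in permutations(subgoals) if respects(o, must_precede)]
```

For three subgoals with the single constraint that `gen` must precede `test`, three of the six orderings remain.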

The second type of SA research that can be combined with the DAC algorithm is cor- 
rectness analysis, where the program is tested against specifications given by the user. 

The FOLON environment (Henrard & Le Charlier, 1992) was designed to support the 
methodology for logic program construction that aims at reconciling the declarative semantics 
with an efficient implementation (Deville, 1990). The construction process starts with 
a specification, converts it into a logic description and finally, into a Prolog program. If 
the rules of the program are not correct with respect to the initial specification, the sys- 
tem performs transformations such as reordering literals in a clause, adding type checking 
literals and so on. De Boeck and Le Charlier (1990) mention this reordering, but do not 
specify an ordering algorithm different from the simple generate-and-test method. Cortesi, 
Le Charlier, and Rossi (1997) present an analyzer for verifying the correctness of a Prolog 
program relative to a specification which provides a list of input/output annotations for the 
arguments and parameters that can be used to establish program termination. Again, no 
ordering algorithm is given explicitly. The purpose of the DAC algorithm is complementary 
to the purpose of FOLON, and it could serve as an auxiliary aid to make the resulting Prolog 
program more efficient. 

Recently, the Mercury language was developed at the University of Melbourne (Somogyi 
et al., 1996a, 1996b). Mercury is a strongly typed and strongly moded language. Type and 
mode declarations should be supplied by the programmer (though recent releases of the 
Mercury system already support partial inference of types and modes - Somogyi et al., 
1996a). The compiler checks that mode declarations for all predicates are satisfied; if 
necessary, it reorders subgoals in the rule body to ensure mode correctness (and rejects the 
program if no ordering satisfies the mode declaration constraints). When the compiler 
performs this reordering, it does not consider the efficiency issue. It often happens that 
several orderings of a rule body satisfy the mode declaration constraints: in such cases 
the Mercury compiler could call the static version of the DAC algorithm to select the most 
efficient ordering. Another alternative is to augment the DAC algorithm by mode declaration 
checks, as was shown in Figure 16. 

Note that Mercury is a purely declarative logic programming language, and is therefore 
more suitable for subgoal reordering than Prolog. It has no non-logical constructs that 
could destroy the declarative semantics which give logic programs their power; in Mercury 
even I/O is declarative. 

The third type of relevant SA is the cost analysis of logic programs (Debray & Lin, 
1993; Braem et al., 1994; Debray et al., 1997). Cortesi et al. (1997) describe a cost formula 
similar to Equation 5 to select a lowest-cost ordering. However, they used a generate-and- 
test approach which can sometimes be prohibitively expensive. Static analysis of cost and 
number of solutions can be used to obtain the control values, instead of learning them. 
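
To make the expense of generate-and-test concrete, the following sketch enumerates all orderings of a small set of subgoals and counts how many were tested. It assumes independent subgoals with constant `(cost, nsols)` pairs and the additive cost model consistent with the identity used in Appendix A (the cost of a concatenation is c1 + n1 * c2); the sample numbers are invented for illustration. For comparison it also sorts the subgoals in ascending order of cn = (nsols - 1) / cost, which under these assumptions reaches the same minimal cost without enumeration.

```python
from itertools import permutations
from math import factorial

def ordering_cost(order):
    """Cost of executing (cost, nsols) pairs left to right, all solutions."""
    total, branches = 0.0, 1.0
    for cost, nsols in order:
        total += branches * cost   # each earlier solution re-invokes this subgoal
        branches *= nsols
    return total

def generate_and_test(subgoals):
    """Try every ordering; return (cheapest ordering, number of orderings tested)."""
    best, tested = None, 0
    for candidate in permutations(subgoals):
        tested += 1
        if best is None or ordering_cost(candidate) < ordering_cost(best):
            best = candidate
    return best, tested

def sort_by_cn(subgoals):
    """Ascending order of cn = (nsols - 1) / cost (independent subgoals only)."""
    return sorted(subgoals, key=lambda g: (g[1] - 1.0) / g[0])

subgoals = [(10.0, 5.0), (2.0, 0.5), (4.0, 1.0), (1.0, 3.0), (3.0, 2.0)]
best, tested = generate_and_test(subgoals)
print(tested, ordering_cost(best), ordering_cost(sort_by_cn(subgoals)))
```

The number of tested orderings is n!, so the enumeration becomes impractical for long rule bodies, while the sort is only valid when the subgoals do not share free variables.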

The efficiency of logic programs can also be increased by methods of program
transformation (Pettorossi & Proietti, 1994, 1996). One of the most popular approaches is the
"rules + strategies" approach, which consists in starting from an initial program and then
applying one or more elementary transformation rules. Transformation strategies are meta- 
rules which prescribe suitable sequences of applications of transformation rules. 

One of the possible transformation rules is the goal rearrangement rule which transforms 
a program by transposing two adjacent subgoals in a rule body. Obviously, any ordering 
of a rule body can be transformed into any other ordering by a finite number of such 
transpositions. Thus, static subgoal ordering can be considered a special case of program 
transformation where only the goal rearrangement rule is used. On the other hand, dynamic 
and semi-dynamic ordering methods cannot be represented by simple transformation rules, 
since they make use of run-time information (expressed in bindings that rule body subgoals
obtain through unifications of rule heads), and may order the same rule body differently
under different circumstances. 
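
The claim that any ordering can be reached from any other by finitely many applications of the goal rearrangement rule can be illustrated with a short sketch. The function names are hypothetical, and the sketch assumes that the subgoals of the rule body are pairwise distinct; it produces at most n(n-1)/2 adjacent transpositions for a body of n subgoals.

```python
def transpositions_between(source, target):
    """Return adjacent swaps (i, i+1) that turn `source` into `target`."""
    current = list(source)
    swaps = []
    for pos, goal in enumerate(target):
        j = current.index(goal, pos)      # where the wanted subgoal sits now
        while j > pos:                    # bubble it left, one adjacent swap at a time
            current[j - 1], current[j] = current[j], current[j - 1]
            swaps.append((j - 1, j))
            j -= 1
    return swaps

def apply_swaps(order, swaps):
    """Apply a sequence of transpositions to a copy of `order`."""
    result = list(order)
    for i, j in swaps:
        result[i], result[j] = result[j], result[i]
    return result

source = ["a", "b", "c", "d"]
target = ["d", "b", "a", "c"]
swaps = transpositions_between(source, target)
print(swaps, apply_swaps(source, swaps))
```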

A program transformation technique called compiling control (Bruynooghe, De Schreye,
& Krekels, 1989; Pettorossi & Proietti, 1994) follows an approach different from that of
trying to improve the control strategy of logic programs. Instead of enhancing the naive 
Prolog evaluator using a better (and often more complex) computation rule, the program is 
transformed so that the derived program behaves under the naive evaluator exactly as the 
initial program would behave under an enhanced evaluator. Most forms of compiling control 
first translate the initial program into some standard representation (for example, into an 
unfolding tree), while the complex computation rule is used, and then the new program is 
constructed from this representation, with the naive computation rule in mind. 

Reordering of rule body subgoals can be regarded as moving to a complex computation 
rule which selects subgoals in the order dictated by the ordering algorithm. In the case of 
the DAC algorithm, this computation rule may be too complex for simple use of compiling 
control methods. Nevertheless, it can be easily incorporated into a special compiling control 
method. In Section 7.1 we described a method of program rewriting which first builds 
classification trees based on the orderings that were performed in the past, and then uses 
these classification trees for constructing clauses of a derived program. The derived program 
can be efficiently executed under the naive computation rule of Prolog. This technique is 
in fact a kind of compiling control. Its important property is the use of knowledge collected 
from experience (the orderings that were made in the past). 

One transformation method that can significantly benefit from the DAC algorithm is 
unfolding (Tamaki & Sato, 1984). During the unfolding process subgoals are replaced by 
their associated rule bodies. Even if the initial rules were ordered optimally by a human 
programmer or a static ordering procedure, the resulting combined sequence may be far from 
optimal. Therefore it could be very advantageous to use the DAC algorithm for reordering 
of the unfolded rule. As the rules become longer, the potential benefit of ordering grows. 
The danger of high complexity of the ordering procedure can be overcome by using the 
cost-sensitive version of the DAC algorithm (Section 7.1). 

7.3 Conclusions 

In this work we study the problem of subgoal ordering in logic programs. We present both a 
theoretical base and a practical implementation of the ideas, and show empirical results that 
confirm our theoretical predictions. We combine the ideas of Smith and Genesereth (1985), 
Simon and Kadane (1975) and Natarajan (1987) into a novel algorithm for ordering of 
conjunctive goals. The algorithm is aimed at minimizing the time which the logic interpreter 
spends on the proof of the given conjunctive goal. 

The main algorithm described in this paper is the DAC algorithm (Algorithm 6, Sec- 
tion 4.6). It works by dividing the sets of subgoals into smaller sets, producing candidate 
sets of orderings for the smaller sets, and combining these candidate sets to obtain orderings 
of the larger sets. We prove that the algorithm finds a minimal ordering of the given set 
of subgoals, and we show its efficiency under practical assumptions. The algorithm can 
be employed statically (to reorder rule bodies in the program text before the execution
starts), semi-dynamically (to reorder the rule body before the reduction is performed) or
dynamically (to reorder the resolvent after every reduction of a subgoal by a rule body). 

Several researchers (Minker, 1978; Warren, 1981; Naish, 1985a, 1985b; Nie & Plaisted,
1990) proposed various heuristics for subgoal ordering. Though fast, these methods do 
not guarantee finding minimal-cost orderings. Our algorithm provably finds a minimal-cost 
ordering, though the ordering itself may take more time than with the heuristic methods. In 
the future it seems promising to incorporate heuristics into the DAC algorithm. For example, 
heuristics can be used to grade binders in OR-nodes: rather than exhaustively trying all 
subgoals as binders, we could try just one, or several binders, thus reducing the ordering 
time. Also, the current version of our ordering algorithm is suitable only for finding all 
solutions to a conjunctive goal. We would like to extend it to the problem of finding one 
solution, or a fixed number of solutions. 

Another interesting issue for further research is the adaptation of the DAC algorithm to 
interleaving ordering methods (Section 2.3). There, if subgoals of a rule body are added 
to an ordered resolvent, it seems wasteful to start a complete ordering process; we should 
use the information stored in the existing ordering of the resolvent. Perhaps the whole 
divisibility tree of the resolvent should be stored, and its nodes updated when subgoals of 
a rule body are added to the resolvent. 

The ordering algorithm needs control knowledge for its work. This control knowledge is 
the average cost and number of solutions of literals, and it can be learned by training and 
collecting statistics. We make an assumption that the distribution of queries received by 
the system does not change with time; thus, if the training set is based on the distribution 
seen in the past, the system learns relevant knowledge for future queries. We consider the 
issue of learning control values more thoroughly in another paper (Ledeniov & Markovitch, 
1998a), together with other issues concerning the DAC algorithm (such as minimizing the 
total time, instead of minimizing the inference time only). 
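
As a rough illustration of collecting such statistics, the sketch below keeps running averages of cost and number of solutions per literal pattern. The class name, the string keys and the sample observations are hypothetical; the learning procedure actually used is described in the companion paper and may differ.

```python
class ControlStats:
    """Running averages of cost and number of solutions per literal pattern."""

    def __init__(self):
        # pattern -> [sum of costs, sum of solution counts, number of observations]
        self._totals = {}

    def record(self, pattern, cost, nsols):
        """Add one training observation for the given literal pattern."""
        entry = self._totals.setdefault(pattern, [0.0, 0.0, 0])
        entry[0] += cost
        entry[1] += nsols
        entry[2] += 1

    def averages(self, pattern):
        """Return (average cost, average number of solutions); KeyError if unseen."""
        total_cost, total_nsols, count = self._totals[pattern]
        return total_cost / count, total_nsols / count


stats = ControlStats()
stats.record("parent(bound, free)", cost=4.0, nsols=2)
stats.record("parent(bound, free)", cost=6.0, nsols=0)
avg_cost, avg_nsols = stats.averages("parent(bound, free)")
print(avg_cost, avg_nsols)
```

Averages of this kind are meaningful for future queries only under the stationarity assumption stated above.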

Ullman and Vardi (1988) showed that the problem of ordering subgoals to obtain ter- 
mination is inherently exponential in time. The problem we work with is substantially 
harder: we must not only find an order whose execution terminates in finite time, but one 
that terminates in minimal finite time. It is impossible to find an efficient algorithm for 
all cases. The DAC algorithm, however, is efficient in most practical cases, when the graph 
representing the subgoal dependence (Figure 3) is sparsely connected. 

We have implemented the DAC algorithm and tested it on artificial and real domains. 
The experiments show a speedup factor of up to 10 compared with random ordering, and 
up to 13 compared with some alternative ordering algorithms. 

The DAC algorithm can be useful for many practical applications. Formal hardware 
verification has become extremely important in the semiconductor industry. While model 
checking is currently the most widely used technique, it is generally agreed that coping with 
the increasing complexity of VLSI design requires methods based on theorem proving. The 
main obstacle preventing the use of automatic theorem proving is its high computational 
demands. The DAC algorithm may be used for speeding up logic inference, making the use 
of automatic theorem provers more practical. 

Logic has gained increasing popularity for representation of common-sense knowledge. 
It has several advantages, including flexibility and well-understood semantics. Indeed, the
CYC project (Lenat, 1995) has recently moved from frame-based representation to
logic-based representation. However, the large scale of such knowledge bases is likely to present
significant efficiency problems to the inference engines. Using automatic subgoal ordering 
techniques, such as those described here, may help to solve these problems. 

The issue of subgoal ordering takes on new significance with the development of Inductive
Logic Programming (Lavrac & Dzeroski, 1994; Muggleton & De Raedt, 1994). Systems
using this approach, such as FOIL (Quinlan & Cameron-Jones, 1995), try to build correct 
programs as fast as possible, without considering the efficiency of the produced programs. 
Combining the DAC algorithm with Inductive Logic Programming and other techniques for 
the synthesis of logic programs (such as the deductive and the constructive approaches) 
looks like a promising direction. 

Appendix A. Proof of Lemma 8 

In this appendix we present the proof of a lemma which was omitted from the main text of 
the paper for reasons of compactness. Before we prove it we show two auxiliary lemmas. 

Lemma 9

Let $A_1$ and $A_2$ be two ordered sequences of subgoals, and $B$ a set of subgoals. The value of
$cn(A_1 \| A_2)|_B$ lies between the values $cn(A_1)|_B$ and $cn(A_2)|_{B \cup A_1}$.

Proof:

Denote

\[
\begin{array}{lll}
c_1 = cost(A_1)|_B, & n_1 = nsols(A_1)|_B, & cn_1 = cn(A_1)|_B, \\
c_2 = cost(A_2)|_{B \cup A_1}, & n_2 = nsols(A_2)|_{B \cup A_1}, & cn_2 = cn(A_2)|_{B \cup A_1}, \\
c_{1,2} = cost(A_1 \| A_2)|_B, & n_{1,2} = nsols(A_1 \| A_2)|_B, & cn_{1,2} = cn(A_1 \| A_2)|_B.
\end{array}
\]

Then

\[
cn_{1,2} = \frac{n_{1,2} - 1}{c_{1,2}} = \frac{n_1 n_2 - 1}{c_{1,2}}
= \frac{(n_1 - 1) + n_1 (n_2 - 1)}{c_{1,2}}
= \frac{c_1 cn_1 + n_1 c_2 cn_2}{c_{1,2}}
= \frac{c_1}{c_{1,2}} \cdot cn_1 + \frac{n_1 c_2}{c_{1,2}} \cdot cn_2 .
\]

So, $cn_{1,2}$ always lies between $cn_1$ and $cn_2$ (because $c_1 / c_{1,2}$ and $n_1 c_2 / c_{1,2}$ are
positive and sum to 1). More exactly, the point $cn_{1,2}$ divides the segment $[cn_1, cn_2]$ with ratio

\[
(cn_{1,2} - cn_1) : (cn_2 - cn_{1,2}) = n_1 c_2 : c_1 .
\]

In other words, $cn_{1,2}$ is a weighted average of $cn_1$ and $cn_2$. Note that $c_1$ is the amount
of resources spent in the proof-tree of $A_1$, $n_1 c_2$ is the resources spent in the tree of $A_2$, and
$c_{1,2}$ is their sum. So, the more time (relatively) we dedicate to the proof of $A_1$, the closer
$cn_{1,2}$ is to $cn_1$. This conclusion can be generalized to a larger number of components in a
concatenation (the proof is by induction):

\[
\begin{aligned}
cn(A_1 \| A_2 \| \ldots A_k)|_B ={}
& \frac{cost(A_1)|_B}{cost(A_1 \| A_2 \| \ldots A_k)|_B} \cdot cn(A_1)|_B \\
& + \frac{nsols(A_1)|_B \cdot cost(A_2)|_{B \cup A_1}}{cost(A_1 \| A_2 \| \ldots A_k)|_B} \cdot cn(A_2)|_{B \cup A_1} + \ldots \\
& + \frac{nsols(A_1 \| A_2 \| \ldots A_{k-1})|_B \cdot cost(A_k)|_{B \cup A_1 \cup \ldots \cup A_{k-1}}}{cost(A_1 \| A_2 \| \ldots A_k)|_B} \cdot cn(A_k)|_{B \cup A_1 \cup \ldots \cup A_{k-1}}
\end{aligned}
\]

□

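The weighted-average identity of Lemma 9 can be checked numerically with a small sketch. The `(cost, nsols)` values below are invented for illustration, and the second pair is assumed to be already measured under the bindings produced by the first sequence; `cn` is taken to be (nsols - 1) / cost, as in the proof.

```python
def cn(cost, nsols):
    """cn value of a sequence with the given cost and number of solutions."""
    return (nsols - 1.0) / cost

def concat(first, second):
    """(cost, nsols) of A1||A2; `second` is measured under the bindings of A1."""
    c1, n1 = first
    c2, n2 = second
    return (c1 + n1 * c2, n1 * n2)

a1 = (4.0, 3.0)   # hypothetical c1, n1
a2 = (2.0, 0.5)   # hypothetical c2, n2
c12, n12 = concat(a1, a2)

w1 = a1[0] / c12           # share of resources spent on A1
w2 = a1[1] * a2[0] / c12   # share of resources spent on A2
weighted = w1 * cn(*a1) + w2 * cn(*a2)
print(cn(c12, n12), weighted)
```

With these numbers the weights are 0.4 and 0.6, and both expressions evaluate to 0.05, which lies between cn(A1) = 0.5 and cn(A2) = -0.25.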
Lemma 10

Let $S_0$ be a set of subgoals and $N$ be a node in the divisibility tree of $S_0$. Let $O_N =
Q \| A_1 \| A_2 \| R$ be an ordering of $S(N)$, where $A_1$ and $A_2$ are cn-equal max-blocks:
$cn(A_1)|_{B(N) \cup Q} = cn(A_2)|_{B(N) \cup Q \cup A_1}$.

Let $M$ be an ancestor of $N$ and $O_M$ be an ordering of $S(M)$ consistent with $O_N$, where
$A_1$ and $A_2$ are not violated. Then either $A_1$ and $A_2$ are both max-blocks in $O_M$ and all
max-blocks that stand between them are cn-equal to them, or $A_1$ and $A_2$ belong to the same
max-block in $O_M$, or $O_M$ is MC-contradicting.

Proof: By induction on the distance between $N$ and $M$. If $M = N$, then $A_1$ and $A_2$
are max-blocks, and the lemma holds. Let $M \neq N$, and let $M'$ be the child of $M$ whose
descendant is $N$. By the inductive hypothesis, the lemma holds for $N$ and $M'$. Let $O'_{M'}$ be the
projection of $O_M$ on $M'$. $A_1$ and $A_2$ are not violated in $O'_{M'}$, since they are not violated in
$O_M$.

• If $A_1$ and $A_2$ are both max-blocks in $O'_{M'}$, then by the inductive hypothesis all
max-blocks that stand between them are cn-equal to them. If $M$ is an OR-node, no new
subgoals can enter between $A_1$ and $A_2$. If $M$ is an AND-node, the insertion of new
subgoals is possible, but if it violates blocks, or places max-blocks not ordered by cn,
then $O_M$ is MC-contradicting, by Corollary 3 or Lemma 6. So, if $O_M$ is not
MC-contradicting, then all new max-blocks inserted between $A_1$ and $A_2$ must be cn-equal
to them both.

Assume that $A_1$ and $A_2$ are not both max-blocks in $O_M$. Without loss of generality,
let $A_1$ be a member of a larger max-block in $O_M$. We show that $A_2$ must also participate
in the same max-block.

Since $A_1$ joined a larger block, there must exist another block, $B$, adjacent to $A_1$,
such that their pair is cn-inverted. Let $B$ stand to the left of $A_1$ (in the opposite case,
the proof is similar): $O_M = X \| B \| A_1 \| Y \| A_2 \| Z$. The pair $(B, A_1)$ is cn-inverted, i.e.,
$cn(B)|_{B(M) \cup X} > cn(A_1)|_{B(M) \cup X \cup B}$. From Lemma 9,
$cn(B \| A_1)|_{B(M) \cup X} > cn(A_1)|_{B(M) \cup X \cup B}$,
and we must add to the block $B \| A_1$ all blocks from $Y$, because they are all cn-equal
to $A_1$. Also, $cn(A_1)|_{B(M) \cup X \cup B} = cn(A_2)|_{B(M) \cup X \cup B \cup A_1}$, so $A_2$ must also be added
to the block. Thus, $A_1$ and $A_2$ belong to the same max-block in $O_M$.

• If $A_1$ and $A_2$ belong to the same max-block in $O'_{M'}$, then this block is either violated
in $O_M$, or not. In the former case, $O_M$ is MC-contradicting, by Corollary 3. In the
latter case, $A_1$ and $A_2$ belong to the same max-block in $O_M$.

• If $O'_{M'}$ is MC-contradicting, then $O_M$ is MC-contradicting too (the proof is easy). □

Now we can prove Lemma 8:

Lemma 8

Let $S_0$ be a set of subgoals, $N$ be a node in the divisibility tree of $S_0$ and
$O_N = Q \| A_1 \| A_2 \| R$ be an ordering of $S(N)$, where $A_1$ and $A_2$ are max-blocks, mutually
independent and cn-equal under the bindings of $B(N) \cup Q$. Then $O_N$ is blockwise-equivalent
with $O'_N = Q \| A_2 \| A_1 \| R$.

Proof:

Let $S$ be a minimal ordering of $S_0$ binder-consistent with $O_N$. By Corollary 3, $S$ does not
violate the blocks of $O_N$, in particular $A_1$ and $A_2$: $S = X \| A_1 \| Y \| A_2 \| Z$. Let
$S' = S|^{O'_N}_{O_N} = X \| A_2 \| Y \| A_1 \| Z$. We must show that $S'$ is minimal, which implies
blockwise equivalence of $O_N$ and $O'_N$.

If $Y$ is empty, then $Cost(S) = Cost(S')$ by Lemma 2 ($A_1$ and $A_2$ are adjacent, mutually
independent and cn-equal; thus, their transposition does not change the cost).

If $Y$ is not empty, then by Corollary 2 $Y$ is mutually independent of both $A_1$ and $A_2$
($S$ is binder-consistent with $O_N$, therefore $B(N) \subseteq X$, and consequently $Y \cap B(N) = \emptyset$).
$Y$ can be divided into several blocks, each one of them cn-equal to $A_1$ and $A_2$: since $S$
is minimal, $O_N$ cannot be MC-contradicting, and the claim follows from Lemma 10. By
Lemma 9, $cn(Y)|_X = cn(A_1)|_X = cn(A_2)|_X$. By Lemma 2:

\[
\begin{array}{rll}
Cost(S) & = Cost(X \| A_1 \| Y \| A_2 \| Z) = & \text{// swap}(Y, A_2) \\
& = Cost(X \| A_1 \| A_2 \| Y \| Z) = & \text{// swap}(A_1, A_2) \\
& = Cost(X \| A_2 \| A_1 \| Y \| Z) = & \text{// swap}(A_1, Y) \\
& = Cost(X \| A_2 \| Y \| A_1 \| Z) = Cost(S') &
\end{array}
\]

Minimality of $S'$ implies blockwise equivalence of $O_N$ and $O'_N$. □

Appendix B. Correctness of the DAC Algorithm

In this section we show that the DAC algorithm is correct, i.e., given a set of subgoals $S_0$,
it returns its minimal ordering. It suffices to show that the candidate set of the root node
of $DTree(S_0, \emptyset)$ is valid. In such a case, as follows from the definition of valid sets, it must
contain a minimal ordering. The algorithm returns one of the cheapest candidates of the
root. Therefore, if the candidate set of the root is valid, the DAC algorithm must return a
minimal ordering of $S_0$.

We start by defining strong validity of sets of orderings. We then prove that strong 
validity implies validity. Finally, we use induction to prove a theorem, showing that the 
candidate set produced for each node in the divisibility tree is strongly valid. 

Definition: Let $S_0$ be a set of subgoals, $N$ be a node in the divisibility tree of $S_0$. The set
$C_N \subseteq \pi(S(N))$ is strongly valid, if every ordering in $\pi(S(N)) \setminus C_N$ is either MC-contradicting
or blockwise-equivalent to some member of $C_N$, unless no ordering of $S(N)$ is min-consistent.

\[
\begin{aligned}
\mathit{StronglyValid}_{N,S_0}(C_N) \iff{}
& [\exists O'_N \in \pi(S(N)) : MC_{N,S_0}(O'_N)] \rightarrow {} \\
& [O_N \in \pi(S(N)) \setminus C_N \rightarrow MCC_{N,S_0}(O_N) \lor
(\exists O''_N \in C_N \land MCE_{N,S_0}(O_N, O''_N))]
\end{aligned}
\]

Lemma 11 A strongly valid set of orderings is valid.

Proof: Let $S_0$ be a set of subgoals, $N$ be a node in the divisibility tree of $S_0$, $C(N)$ be a
strongly valid set of orderings of $N$.

If there is no min-consistent ordering of $N$, then $C(N)$ is valid, by the definition of a
valid set (Section 4.2).

Otherwise, there exists at least one minimal ordering of $S_0$, binder-consistent with $N$.
Every ordering in $\pi(S(N)) \setminus C(N)$ is either MC-contradicting or blockwise-equivalent to
some member of $C(N)$. To prove that $C(N)$ is valid, we must show that it contains an
ordering $O_N$, which is binder-consistent with some minimal ordering $S$ of $S_0$.

Let $S'$ be a minimal ordering of $S_0$, binder-consistent with $N$. Let $O'_N$ be the projection
of $S'$ on $N$. If $O'_N \in C(N)$, we are done ($O_N = O'_N$, $S = S'$). Otherwise, $O'_N \in \pi(S(N)) \setminus
C(N)$. $O'_N$ cannot be MC-contradicting (it is min-consistent with $S'$), therefore it must be
blockwise-equivalent to some $O''_N \in C(N)$. Blocks of $O'_N$ are not violated in $S'$, since $S'$ is
minimal (Corollary 3). Therefore the substitution $S'' = S'|^{O''_N}_{O'_N}$ is well defined. $S''$ is minimal,
since $S'$ is minimal and $O'_N$ and $O''_N$ are blockwise-equivalent. $S''$ is binder-consistent with
$O''_N$, since $S'$ was binder-consistent with $O'_N$. Thereupon $S''$ and $O''_N$ satisfy the requirements
of validity ($O_N = O''_N$, $S = S''$). □

Theorem 3

Let $S_0$ be a set of subgoals. For each node $N$ of the divisibility tree of $S_0$, Algorithm 6
creates a strongly valid candidate set of orderings.

Proof: By induction on the height of $N$'s subtree.

Inductive base: $N$ is a leaf node, which means that $S(N)$ is independent under $B(N)$.
The candidate set of $N$ contains one element, whose subgoals are sorted by cn. All
orderings that belong to $\pi(S(N)) \setminus CandSet(N)$ are either not sorted by cn, and
hence are MC-contradicting (Lemma 4), or are sorted by cn, and hence are
blockwise-equivalent to the candidate (Corollary 4). Consequently, $CandSet(N)$ is strongly
valid.

Inductive hypothesis: For all children of $N$, Algorithm 6 produces strongly valid
candidate sets.

Inductive step: An internal node in a divisibility tree is either an AND-node or an
OR-node.

1. $N$ is an AND-node. Let $N_1, N_2, \ldots N_k$ be the children of $N$. First we show that
$ConsSet(N)$ is strongly valid.

Let $O_N \in \pi(S(N)) \setminus ConsSet(N)$. For all $1 \le i \le k$, let $O_i$ be the projection of
$O_N$ on $N_i$. The set of projections $\{O_1, O_2, \ldots O_k\}$ can belong to one of the three
following types, with regard to $O_N$.

(a) The sets of the first type contain at least one MC-contradicting projection. In
such a case $O_N$ is MC-contradicting too. Assume the contrary: there exists
a minimal ordering $S$ of $S_0$, binder-consistent with $O_N$. Let $O_i$ be an
MC-contradicting projection. Since $O_i$ is consistent with $O_N$, it is also consistent
with $S$. Since $B(N_i) = B(N)$, all subgoals of $B(N_i)$ appear in $S$ before
subgoals of $S(N_i)$. Therefore, $O_i$ is binder-consistent with $S$, and since $S$ is
minimal, $O_i$ is min-consistent and not MC-contradicting - a contradiction.

(b) The sets of the second type do not contain MC-contradicting projections, but
in $O_N$ some block of some projection is violated, or max-blocks from different
projections are not ordered by cn. In such a case, $O_N$ is MC-contradicting,
by Corollary 3 and Lemma 6.

(c) The sets of the third type do not contain MC-contradicting projections, and
max-blocks of the projections are not violated in $O_N$ and are sorted by cn.
Every projection $O_i$ either belongs to $CandSet(N_i)$, or not. If $O_i \notin CandSet(N_i)$,
then there exists $O'_i \in CandSet(N_i)$ such that $O_i$ is blockwise-equivalent to
$O'_i$ (because $CandSet(N_i)$ is strongly valid by the inductive hypothesis, and
$O_i$ is not MC-contradicting). If $O_i \in CandSet(N_i)$, we can set $O'_i = O_i$.

Let $O'_N = O_N|^{O'_1}_{O_1}|^{O'_2}_{O_2} \ldots |^{O'_k}_{O_k}$. This substitution is well defined, since each $O_i$
has the same number of max-blocks as $O'_i$, and max-blocks of the projections
are not violated in $O_N$. Let $S$ be a minimal ordering of $S_0$, binder-consistent
with $O_N$. Since $S$ is minimal, blocks of $O_1$ are not violated in $S$. Since $O_1$
is blockwise-equivalent to $O'_1$, the ordering $S_1 = S|^{O'_1}_{O_1}$ is well-defined and
minimal. In $S_1$ the positions of the subgoals from $B(N)$ did not change;
thus, $O_2$ is min-consistent with $S_1$, and blockwise equivalence of $O_2$ and $O'_2$
entails minimality of the ordering $S_2 = S_1|^{O'_2}_{O_2} = S|^{O'_1}_{O_1}|^{O'_2}_{O_2}$. We continue with
other $O_i$'s, and finally obtain that $S' = S|^{O'_1}_{O_1}|^{O'_2}_{O_2} \ldots |^{O'_k}_{O_k}$ is minimal. From the
definition of $O'_N$, $S' = S|^{O'_N}_{O_N}$ (note that we introduced blockwise equivalence
and strong validity only to be able to perform this transition). $S'$ is minimal,
therefore $O_N$ is blockwise-equivalent to $O'_N$. $O'_N \in ConsSet(N)$, since all its
projections are candidates of the child nodes. Thereupon, $O_N$ is
blockwise-equivalent to a member of $ConsSet(N)$.

So, $ConsSet(N)$ is strongly valid. To prove that $CandSet(N)$ is strongly valid,
it suffices to show that all the members of $ConsSet(N)$ that are not included in
$CandSet(N)$ by Algorithm 6, are either MC-contradicting or blockwise-equivalent
to members of $CandSet(N)$. Such orderings can be of three types:

(a) Orderings that violate blocks of the children projections. They are
MC-contradicting by Corollary 3.

(b) Orderings that do not violate blocks, but where max-blocks of children
projections are not ordered by cn. They are MC-contradicting by Lemma 6.

(c) Orderings that do not violate blocks and have them sorted by cn. For each
combination of projections, one consistent ordering of $N$ is retained in the
candidate set, and all the others are rejected. By Corollary 5, the rejected
orderings are blockwise-equivalent to the retained candidate.

Consequently, $CandSet(N)$ is strongly valid.

2. $N$ is an OR-node. Again, we start with showing that $ConsSet(N)$ is strongly
valid.

Let $O_N \in \pi(S(N)) \setminus ConsSet(N)$. $O_N$ is constructed from a binder $H$ and a
"tail" sequence $T$: $O_N = H \| T$. Let $N_H$ be the child of $N$ that corresponds
to the binder $H$. By the inductive hypothesis, $CandSet(N_H)$ is strongly valid.
$T \notin CandSet(N_H)$, since otherwise $O_N \in ConsSet(N)$. Therefore, $T$ is either
MC-contradicting, or blockwise-equivalent to some $T' \in CandSet(N_H)$.
If $T$ is MC-contradicting, $O_N$ is MC-contradicting too (proof by contradiction,
as for AND-nodes). If $T$ is blockwise-equivalent to $T'$, then $O_N = H \| T$
is blockwise-equivalent to $H \| T' \in ConsSet(N)$ (the proof is easy). Hence,
$ConsSet(N)$ is strongly valid. The only orderings of $ConsSet(N)$ that are not
included in $CandSet(N)$ by the DAC algorithm have cheaper permutations of their
leading max-blocks, and therefore are MC-contradicting, by Lemma 7. Hence,
$CandSet(N)$ is strongly valid. □

Corollary 7 The candidate set found by Algorithm 6 for the root node is valid. 

Corollary 8 Algorithm 6 finds a minimal ordering of the given set of subgoals.